
Probability Concepts 

in Engineering 

Planning and Design 

VOLUME I BASIC PRINCIPLES 

Alfredo H-S. Ang


Professor of Civil Engineering 
University of Illinois at Urbana-Champaign 


Wilson H. Tang 


Associate Professor of Civil Engineering 
University of Illinois at Urbana-Champaign 


contents 


1. Role of Probability in Engineering
1.1. Introduction
1.2. Uncertainty in Real-World Information
1.2.1. Uncertainty Associated with Randomness
1.2.2. Uncertainty Associated with Imperfect Modeling and Estimation
1.3. Design and Decision-Making Under Uncertainty
1.3.1. Planning and Design of Airport Pavement
1.3.2. Hydrologic Design
1.3.3. Design of Structures and Machines
1.3.4. Geotechnical Design
1.3.5. Construction Planning and Management
1.3.6. Photogrammetric, Geodetic, and Surveying Measurements
1.4. Control and Standards
1.5. Concluding Remarks


2. Basic Probability Concepts
2.1. Events and Probability
2.1.1. Characteristics of Probability Problems
2.1.2. Calculation of Probability
2.2. Elements of Set Theory
2.2.1. Definitions
2.2.2. Combination of Events
2.2.3. Operational Rules
2.3. Mathematics of Probability
2.3.1. Basic Axioms of Probability; Addition Rule
2.3.2. Conditional Probability; Multiplication Rule
2.3.3. Theorem of Total Probability
2.3.4. Bayes' Theorem
2.4. Concluding Remarks
Problems
3. Analytical Models of Random Phenomena
3.1. Random Variables
3.1.1. Probability Distribution of a Random Variable
3.1.2. Main Descriptors of a Random Variable
3.2. Useful Probability Distributions
3.2.1. The Normal Distribution
3.2.2. The Logarithmic Normal Distribution
3.2.3. Bernoulli Sequence and the Binomial Distribution
3.2.4. The Geometric Distribution
3.2.5. The Negative Binomial Distribution
3.2.6. The Poisson Process and Poisson Distribution
3.2.7. The Exponential Distribution
3.2.8. The Gamma Distribution
3.2.9. The Hypergeometric Distribution
3.2.10.
3.2.11.
3.3. Multiple Random Variables


4. Functions of Random Variables
4.1. Introduction
4.2. Derived Probability Distributions
4.2.1. Function of Single Random Variable
4.2.2. Function of Multiple Random Variables
4.3. Moments of Functions of Random Variables
4.3.1. Introduction
4.3.2. Mean and Variance of a Linear Function
4.3.3. Product of Independent Variates
4.3.4. Mean and Variance of a General Function
4.4. Concluding Remarks
Problems
5. Estimating Parameters from Observational Data
5.1. The Role of Statistical Inference in Engineering
5.1.1. Inherent Variability and Estimation Error
5.2. Classical Approach to Estimation of Parameters
5.2.1. Random Sampling and Point Estimation
5.2.2. Interval Estimation of the Mean
5.2.3. Problems of Measurement Theory
5.2.4. Interval Estimation of the Variance
5.2.5. Estimation of Proportion
5.3. Concluding Remarks
Problems
6. Empirical Determination of Distribution Models
6.1. Introduction
6.2. Probability Paper
6.2.1. The Normal Probability Paper
6.2.2. The Log-Normal Probability Paper
6.2.3. Construction of General Probability Paper
6.3. Testing Validity of Assumed Distribution
6.3.1. Chi-Square Test for Distribution
6.3.2. Kolmogorov-Smirnov Test for Distribution
6.4. Concluding Remarks
Problems
7. Regression and Correlation Analyses
7.1. Basic Formulation of Linear Regression
7.1.1. Regression with Constant Variance
7.1.2. Regression with Nonconstant Variance
7.2. Multiple Linear Regression
7.3. Nonlinear Regression
7.4. Applications of Regression Analysis in Engineering
7.5. Correlation Analysis
7.5.1. Estimation of Correlation Coefficient
7.6. Concluding Remarks
Problems
8. The Bayesian Approach
8.1. Introduction
8.2. Basic Concepts: The Discrete Case
8.3. The Continuous Case
8.3.1. General Formulation
8.3.2. A Special Application of Bayesian Up-dating Process
8.4. Bayesian Concepts in Sampling Theory
8.4.1. General Formulation
8.4.2. Sampling from Normal Population
8.4.3. Error in Estimation
8.4.4. Use of Conjugate Distributions
8.5. Concluding Remarks
Problems
9. Elements of Quality Assurance and Acceptance Sampling
9.1. Acceptance Sampling by Attributes
9.1.1. The Operating Characteristic (OC) Curve
9.1.2. The Success Run
9.1.3. The Average Outgoing Quality Curve
9.2. Acceptance Sampling by Variables
9.2.1. Average Quality Criterion, σ Known
9.2.2. Average Quality Criterion, σ Unknown
9.2.3. Fraction Defective Criterion
9.3. Multiple-Stage Sampling


Appendix A. Probability Tables
Table A.1. Standard Normal Probability
Table A.2. p-Percentile Values of the t-Distribution
Table A.3. p-Percentile Values of the χ²-Distribution
Table A.4. Critical Values of D* in the Kolmogorov-Smirnov Test

Appendix B. Combinatorial Formulas

Appendix C. Derivation of the Poisson Distribution

References

Index


1. Role of Probability in Engineering


1.1. INTRODUCTION


Quantitative methods of modeling, analysis, and evaluation are the
tools of modern engineering. Some of these methods have become quite
elaborate and include sophisticated mathematical modeling and analysis,
computer simulation, and optimization techniques. However, irrespective
of the level of sophistication in the models, including experimental
laboratory models, they are predicated on idealized assumptions or
conditions; hence, information derived from these quantitative models
may or may not reflect reality closely.

In the development of engineering designs, decisions are often required
irrespective of the state of completeness and quality of information, and
thus must be formulated under conditions of uncertainty, in the sense that
the consequence of a given decision cannot be determined with complete
confidence. Aside from the fact that information must often be inferred
from similar (or even different) circumstances or derived through modeling,
and thus may be in various degrees of imperfection, many problems in
engineering involve natural processes and phenomena that are inherently
random; the states of such phenomena are naturally indeterminate and
thus cannot be described with definiteness. For these reasons, decisions
required in the process of engineering planning and design invariably
must be made, and are made, under conditions of uncertainty.

The effects of such uncertainty on design and planning are important,
to be sure; however, the quantification of such uncertainty, and evaluation
of its effects on the performance and design of an engineering system,
properly should include concepts and methods of probability. Furthermore,
under conditions of uncertainty the design and planning of engineering
systems involve risks, and the formulation of related decisions requires
risk-benefit trade-offs, all of which are properly within the province of
applied probability.

In this light, we see that the role of probability is quite pervasive in
engineering; it ranges from the description of information to the
development of bases for design and decision making. Specific examples of
such information and engineering design and decision-making problems are
described in the following sections.

Figure 1.1 Histogram and frequency diagram of annual rainfall intensity
(Esopus Creek Watershed, New York, 1918 to 1946; total number of
observations = 29). (a) In number of observations. (b) In fraction of
total observations. (c) Frequency diagram

Figure 1.2 Histogram of yield strength of intermediate grade
reinforcing bars; data from Julian (1957)

Figure 1.3 Histogram of ultimate shear strength of fillet welds in
structural connections; after Kulak (1972)


1.2. UNCERTAINTY IN REAL-WORLD INFORMATION 


1.2.1. Uncertainty associated with randomness 


Many phenomena or processes of concern to engineers contain randomness;
that is, the actual outcomes are (to some degree) unpredictable. Such
phenomena are characterized by experimental observations that are
invariably different from one experiment to another (even if performed
under apparently identical conditions). In other words, there is usually
a range of measured or observed values; moreover, within this range certain
values may occur more frequently than others. The characteristics of such
experimental data can be portrayed graphically in the form of a histogram
or frequency diagram, such as those shown in Figs. 1.1 through 1.17, which
represent information on physical phenomena of significance in engineering.
(In some of these figures, specifically Figs. 1.5, 1.6, 1.7, 1.10, 1.13, 1.14,
and 1.17, theoretical probability density functions are also shown; the
significance of these theoretical functions and their relations to the
experimental frequency diagrams are discussed in Chapters 3 and 6.)

A large number of physical phenomena are represented in Figs. 1.1
through 1.17; these are collected here purposely to demonstrate the fact
that the natural state of most engineering information contains significant
variability.

The histogram, therefore, is a graphic empirical description of the
variability of experimental information. For a specific set of experimental
data, the corresponding histogram may be constructed as follows.

From the observed range of experimental measurements, choose a
range on the abscissa (for a two-dimensional graph) sufficient to include the
largest and smallest observed values, and divide this range into convenient
intervals. Then count the number of observations within each interval,
and draw vertical bars with heights representing the number of observations
in the respective intervals. Alternatively, the heights of the bars may
be expressed in terms of the fraction of the total number of observations in
each interval. For example, consider the annual rainfall intensity in the
watershed area of the Esopus Creek in New York, between the years 1918
and 1946, as presented in Table 1.1. An examination of these data will
reveal that the observed rainfall intensity ranges from 39.91 to 67.72 in.
Choosing a uniform interval of 4 in. between 38 and 70 in., the number of
observations (and corresponding fraction of total observations) within
each interval are shown in Table 1.2.

Table 1.1.

Year    Rainfall intensity (in.)
1918    43.30
1919    53.02
1920    63.52
1921    45.93
1922    48.26
1923    50.51
1924    49.57
1925    43.93
1926    46.77
1927    59.12
1928    54.49
1929    47.38
1930    40.78
1931    45.05
1932    50.37
1933    54.91
1934    51.28
1935    39.91
1936    53.29
1937    67.59
1938    58.71
1939    42.96
1940    55.77
1941    41.81
1942    58.83
1943    48.21
1944    44.07
1945    67.72
1946    43.11

Table 1.2.

Interval (in.)    Number of observations    Fraction of total observations
38-42              3                        0.1034
42-46              7                        0.2415
46-50              5                        0.1724
50-54              5                        0.1724
54-58              3                        0.1034
58-62              3                        0.1034
62-66              1                        0.0345
66-70              2                        0.0690
Total             29                        1.0000

Figure 1.4 Histogram of modulus of elasticity of 2 × 6 lumber; after
Galligan and Snodgrass (1970)

Figure 1.5 Frequency diagram of fatigue lives of 75 S-T aluminum; after
Pugsley (1955)

Figure 1.6 Frequency diagram of wave heights above mean sea level; after
Cartwright and Longuet-Higgins (1956)

Figure 1.7 Frequency diagram of midship bending stress from one typical
record "S.S. Wolverine State"; after Hoffman and Lewis (1969)

Plotting the number of observations in a given rainfall interval, we
obtain the histogram of the rainfall intensity in the Esopus Creek
watershed as shown in Fig. 1.1a, whereas, in terms of the fraction of total
observations, the same histogram would be as shown in Fig. 1.1b.

For the purpose of comparing an empirical frequency distribution
(as, for example, portrayed in a histogram) with a theoretical probability
density function, the corresponding frequency diagram is required. This
may be obtained from the histogram by simply dividing the ordinates of the
histogram by its total area. In the case of the histogram of Fig. 1.1, we
obtain the corresponding frequency diagram by dividing the ordinates in
Fig. 1.1a by 29 × 4 = 116; or alternatively, by dividing the ordinates in
Fig. 1.1b by 4 × 1 = 4. The result would be as shown in Fig. 1.1c, which
is the frequency diagram for the rainfall intensity of the Esopus Creek
watershed.
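The construction described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data are the Table 1.1 values, the 4-in. intervals from 38 to 70 in. follow the text, and the function name `histogram` is a hypothetical helper, not something defined in this book.

```python
# Annual rainfall intensity (in.), Esopus Creek watershed, 1918-1946 (Table 1.1).
rainfall = [
    43.30, 53.02, 63.52, 45.93, 48.26, 50.51, 49.57, 43.93, 46.77, 59.12,
    54.49, 47.38, 40.78, 45.05, 50.37, 54.91, 51.28, 39.91, 53.29, 67.59,
    58.71, 42.96, 55.77, 41.81, 58.83, 48.21, 44.07, 67.72, 43.11,
]


def histogram(data, start, width, n_bins):
    """Count observations in n_bins equal intervals beginning at `start`.

    Interval k covers [start + k*width, start + (k+1)*width).
    """
    counts = [0] * n_bins
    for x in data:
        k = int((x - start) // width)
        if not 0 <= k < n_bins:
            raise ValueError(f"{x} lies outside the chosen range")
        counts[k] += 1
    return counts


WIDTH = 4.0  # interval width in inches, as chosen in the text
counts = histogram(rainfall, start=38.0, width=WIDTH, n_bins=8)
n = len(rainfall)

# Histogram in fraction of total observations (Fig. 1.1b).
fractions = [c / n for c in counts]

# Frequency diagram (Fig. 1.1c): divide the counts by the total area n * WIDTH,
# so that the area under the diagram equals 1.
density = [c / (n * WIDTH) for c in counts]

for k, (c, f, d) in enumerate(zip(counts, fractions, density)):
    lower = 38 + 4 * k
    print(f"{lower}-{lower + 4} in.: {c:2d}  {f:.4f}  {d:.4f}")
```

Dividing by 29 × 4 = 116 is what makes the ordinates comparable with a probability density function, whose total area is also unity.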


Figure 1.8 Relative dispersions of measured pressure fluctuations on a tall
building during typhoons; after Lam Put (1971). (a) Typhoon Ruby.
(b) Typhoon Georgia

Figure 1.9 Relative dispersion of earthquake-induced shear stresses in
soils; after Donovan (1972)

Figure 1.10 Histogram of density of compacted volcanic tuff subgrade;
after Pettitt (1967)


The histogram, or frequency diagram, gives a graphic picture of the
relative frequencies of the various observations, or measurements. For most
engineering purposes, certain aggregate quantities from the set of
observations are more useful than the complete histogram; these include, in
particular, the mean value (or average) and a measure of dispersion. Such
quantities may be evaluated from a given histogram; statistically, however,
these are usually obtained in terms of the sample mean and sample standard
deviation, as described in Chapter 5.
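As a brief illustration of these aggregate quantities, the sketch below computes the sample mean and sample standard deviation of the Table 1.1 rainfall data with Python's standard `statistics` module. It assumes the usual n - 1 divisor for the sample variance, which is what `statistics.stdev` uses; the detailed treatment is left to Chapter 5.

```python
import statistics

# Annual rainfall intensity (in.), Esopus Creek watershed, 1918-1946 (Table 1.1).
rainfall = [
    43.30, 53.02, 63.52, 45.93, 48.26, 50.51, 49.57, 43.93, 46.77, 59.12,
    54.49, 47.38, 40.78, 45.05, 50.37, 54.91, 51.28, 39.91, 53.29, 67.59,
    58.71, 42.96, 55.77, 41.81, 58.83, 48.21, 44.07, 67.72, 43.11,
]

# Sample mean: the arithmetic average of the observations.
sample_mean = statistics.mean(rainfall)

# Sample standard deviation: square root of the sum of squared deviations
# from the mean, divided by (n - 1).
sample_std = statistics.stdev(rainfall)

print(f"sample mean               = {sample_mean:.2f} in.")
print(f"sample standard deviation = {sample_std:.2f} in.")
```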

Clearly, if recorded data of a variable exhibit scatter or dispersion, such 
as those illustrated in Figs. 1.1 through 1.17, the value of the variable 
cannot be predicted with certainty. Such a variable is known as a random 
variable, and its value (or range of values) can be predicted only with an 
associated probability. 

When two (or more) random variables are involved, the characteristics 
of one variable may depend on the value of the other variable (or variables). 


Figure 1.11 Histograms of dissolved oxygen (DO) deficit in Ohio River;
after Kothandaraman and Ewing (1969)

Figure 1.12 O-D trip length frequency distribution, Sioux Falls, S. D.,
1956; after U.S. Department of Commerce (1965)

Figure 1.13 Frequency distribution of walking distances for off-street
parkers (all trip purposes); after Kanafani (1972)

Figure 1.14 Histogram of gap length between cars on freeway; after
Gerlough (1955)

Figure 1.15 Estimated impact speed in 5237 passenger-car single-vehicle
accidents; after Viner (1972)

Figure 1.16 Histograms of completion time for house building in England;
after Forbes (1969)

Figure 1.17 Highway bid distribution, southwest Texas; after Cox (1969)


Pairs of observed data for the two variables, when plotted on a
two-dimensional space, would appear as in Figs. 1.18 through 1.22; such
plots, which are characterized by scatter or dispersion in the data points,
are called scattergrams. In view of such scatter, the value of one variable,
given that of the other, cannot be predicted with certainty. The degree of
predictability will depend on the degree of mutual dependency or
correlation between the variables, as measured (in the linear case) by the
statistical correlation. Methods for evaluating such correlations are also
embodied in statistical analysis.

Figure 1.18 Linear regression of modulus of elasticity on strength ratio
for 2 × 10 air-dry (16% m.c.) Douglas-fir; after Littleford (1967).
Fitted line: MOE = 13,171.50 S.R. + 639,000; r = 0.85

Figure 1.19 Plane-strain fracture-toughness behavior of 18 Ni (250)
maraging steel as a function of temperature; after Barsom (1971)

Figure 1.20 Relationship between stress range and fatigue life of welded
beams; after Fisher et al. (1970)
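For the linear case, the degree of correlation between two sets of paired observations is commonly summarized by the sample correlation coefficient. The following is a minimal sketch of that calculation; the paired values are hypothetical numbers made up for illustration (they are not read from Figs. 1.18 through 1.22), and `correlation_coefficient` is a hypothetical helper name.

```python
import math


def correlation_coefficient(xs, ys):
    """Sample (Pearson) correlation coefficient of paired observations."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long samples of at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products of deviations from the respective means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# HYPOTHETICAL paired data, for illustration only.
x_obs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y_obs = [2.1, 3.9, 6.2, 7.8, 10.1, 11.7]

r = correlation_coefficient(x_obs, y_obs)
print(f"r = {r:.3f}")
```

A value of r near +1 or -1 indicates that one variable can be predicted from the other with little scatter; a value near 0 indicates little linear dependency.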

It should be strongly emphasized that the application of probability is 
not limited to the description of experimental data, or to the evaluation 
of the associated statistics (such as the mean, standard deviation, and 
correlation). Indeed, the much more significant role of probability concepts 
is in the utilization of this information in the formulation of proper bases 
for decision making and design. In other words, when we are dealing with 
information such as that illustrated in Figs. 1.1 through 1.22, which 
requires probabilistic description, the proper utilization of this information 
in engineering design and planning will necessarily require concepts and 
methods of probability. For example, if a design equation involves random 
variables, such as those described in Figs. 1.1 through 1.22, the quantitative
analysis of their effects on, and the formulation of, the design will neces- 
sarily involve probabilistic concepts. 


1.2.2. Uncertainty associated with imperfect modeling and
estimation


Engineering uncertainty, however, is not limited purely to the
variability observed in the basic variables. First, the estimated values of
a given variable (such as the mean) based on observational data will not be
error-free (especially when data are limited). In fact, in some cases, such
estimates may not be much better than "educated guesses," based largely on
the engineer's judgement. Second, the mathematical or simulation models
(for example, formulas, equations, and algorithms) and laboratory models
that are often used in engineering analysis and to develop designs are
idealized representations of reality; in various degrees, such models are
imperfect representations of the real world. Consequently, predictions
and/or calculations made on the basis of these models may be inaccurate
(to some unknown degree) and thus also contain uncertainty. In certain
cases, the uncertainties associated with such prediction or model errors
may be much more significant than those associated with the inherent
variabilities.

Uncertainties, whether they are associated with inherent variability
or with prediction error, may be assessed in statistical terms, and the
evaluation of their significance on engineering design accomplished using
concepts and methods that are embodied in the theory of probability.


1.3. DESIGN AND DECISION MAKING UNDER UNCERTAINTY 


If information is of the kind illustrated in Figs. 1.1 through 1.17, in
which no single observation is representative, and evaluations and
predictions must be based on imperfect models, how should designs be
formulated or decisions affecting a design be resolved? Presumably we may
assume consistently worst conditions (for example, specify the highest
possible flood, smallest observed fatigue life of materials, and so on) and
develop conservative designs on this basis. From the standpoint of system
performance and safety, this approach may be suitable; however, the
resulting design could be too costly as a consequence of "compounded
conservatism." On the other hand, an inexpensive design may not ensure
the desired level of performance or safety. Therefore, decisions based on
trade-off between cost and benefit (including tangible and intangible
factors) are necessary. The most desirable solution is one that is optimal,
in the sense of minimum cost and/or maximum benefits. If the available
information and evaluative models contain uncertainties, the required
trade-off analysis should include the effects of such uncertainties on a
given decision.

Figure 1.21 Relation of mean annual discharge to drainage area for
streams near Honolulu; after Todd and Meyer (1971)

Figure 1.22 Land value ($1000 per acre, 1970 dollars) versus population
per acre; after Ward (1970)

Such situations are common to many problems of engineering design 
and planning; in this section we describe several examples illustrating 
some of these problems. The examples are idealized to simplify the dis- 
cussions; nevertheless, they serve to illustrate the essence of the decision 
making aspects of engineering under conditions of uncertainty.


1.3.1. Planning and design of airport pavement 


Consider, as the first example, the design of an airport pavement.
Among the many factors that have bearing on the design, the thickness of
the pavement system (consisting of several layers of subgrade base material
and the finished pavement) is one of the principal decision variables.
In general, the usable life of the pavement will depend on the thickness
of the system; the thicker the pavement system is, the longer its useful
life will be. Of course, for the same material and workmanship quality, the
cost will also increase with the thickness. On the other hand, a thin system
will cost less initially, but the subsequent maintenance and replacement
costs will be higher. Therefore the thickness of the pavement system may
be determined on the basis of a trade-off between high initial cost with
low maintenance, versus low initial cost but high replacement and
maintenance costs. For the purpose of such trade-off analysis, the relation
between the life of a pavement system and its thickness is required.
However, the pavement life is also a function of other variables, including
drainage and moisture content, temperature ranges, density and degree
of compaction of the subgrade. Since these factors are random (see Fig.
1.10 for the subgrade density), the life of the pavement cannot be predicted
with certainty. Hence, the total cost (including initial and maintenance
costs) associated with a given pavement thickness cannot be estimated
with complete confidence; any meaningful trade-off analysis, therefore,
would properly require concepts of probability.




1.3.2. Hydrologic design 

Suppose that the protection of a large farm from flooding requires the
construction of a main culvert at the junction of a roadway and a stream.
A decision on the size (flow capacity) of the culvert is required; clearly,
the size will depend on the high stream flow, which is a function of the
rainfall intensity in the watershed and the associated runoff. If the culvert
is large enough to handle the largest possible flow, there will be no
danger of flooding; however, the cost of constructing the culvert may
be prohibitively high, and even during the heaviest rainfall the culvert
may be used only to a fraction of its capacity; that is, overdesign will
be wasteful and costly. On the other hand, if the culvert is too small, its
cost may be low, but the farm is likely to be subject to flooding
every time there is a heavy rainfall, causing damage to crops and erosion
of the soil.

The decision would properly require probability considerations for the
following reasons. First, the annual rainfall intensity is highly variable
(as illustrated in Fig. 1.1), and the prediction of the stream flow may
be imperfect; consequently, the maximum stream flow (for a given
year) cannot be predicted with certainty. Assuming that the flow capacity
of a given culvert size can be determined accurately, the size of the culvert
would depend on the probability of flooding over a given period (for
example, ten years). The culvert size then may be selected such that the
total expected cost, consisting of the initial cost of construction and
the expected loss from flooding and erosion, is minimized. The expected
loss is a function of the probability of flooding, and hence the definition
of the total expected cost requires probability measures.
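The trade-off just described can be made concrete with a small numerical sketch. Everything quantitative here is an assumption introduced for illustration: the three candidate culvert sizes, their initial costs, their annual flooding probabilities, the flood loss, and the simplification that years are statistically independent and that the loss is incurred once if any flood occurs within the period. None of these figures come from the text.

```python
def prob_flood_in_period(p_annual, years):
    """Probability of at least one flood in `years` years.

    ASSUMPTION: floods in different years are statistically independent,
    each occurring with probability p_annual.
    """
    return 1.0 - (1.0 - p_annual) ** years


def total_expected_cost(initial_cost, p_annual, flood_loss, years=10):
    """Initial cost plus expected flood loss over the period.

    ASSUMPTION: the loss is counted once if any flood occurs in the period.
    """
    return initial_cost + prob_flood_in_period(p_annual, years) * flood_loss


# HYPOTHETICAL candidates: name -> (initial cost in $, annual flood probability).
candidates = {
    "small": (20_000.0, 0.20),
    "medium": (45_000.0, 0.05),
    "large": (120_000.0, 0.01),
}
FLOOD_LOSS = 200_000.0  # hypothetical loss in $ from flooding and erosion

costs = {
    name: total_expected_cost(cost, p, FLOOD_LOSS)
    for name, (cost, p) in candidates.items()
}
best = min(costs, key=costs.get)

for name, c in costs.items():
    print(f"{name:6s}: total expected cost = ${c:,.0f}")
print(f"minimum expected cost: {best}")
```

With these made-up numbers the intermediate size has the lowest total expected cost: the small culvert is cheap to build but floods often, while the large one buys little additional protection for its extra cost.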


1.3.3. Design of structures and machines 

In Fig. 1.8 we have an example showing that magnitudes of load (wind
pressures in the case of Fig. 1.8) may be described with random variables,
whereas in Figs. 1.2 and 1.3 the strengths of structural material and
components are shown to be also random, and thus the resulting structural
resistance will likewise be random. Even for this reason alone, the
design of a structure (determination of how strong it should be) must
consider the question "How safe is safe enough?", a question that
theoretically requires the consideration of risk or probability of failure.

To be specific, consider for example the design of an offshore drilling
tower, which is subject to occasional hurricane forces. In such a case, we
recognize that aside from the fact that the wind force during
a hurricane is random, the occurrence of hurricanes in a given
region is also unpredictable. Hence, in determining the safety level for the
design of the tower, the probability of occurrence of strong hurricanes
within the specified useful life of the structure must be considered, in
addition to the survival probability of the structure during a hurricane.
The higher the hurricane force gets, the less frequent will be its occurrence;
therefore, if a very strong hurricane is specified for the design, there may
be almost no chance of its occurring within the lifetime of the
tower. Consequently, what level of hurricane force should be specified in
the design, and what level of protection would be adequate during a
hurricane, are decisions that clearly require trade-offs between cost and
level of protection in terms of risk or failure probability of the
structure.

In structural or machine components that are subject to repeated or
cyclic loads, the fatigue life (that is, the number of load cycles until fatigue
failure or fracture) of the component is also random, even at
constant-amplitude stress cycle as illustrated in Fig. 1.5. For this reason,
the useful life of the component is, to some degree, unpredictable. A design
will depend on the required life and desired level of reliability; for a given
design, the shorter the required service life, the higher will be its
reliability against possible breakdowns within the specified service life.
Fatigue life is also a function of the applied stress level; generally, the
higher the stress, the shorter the fatigue life. Consequently, if a desired
life is specified, the components could be designed to be massive so that
the maximum stresses will be low and thus assure long life. This approach
will, of course, be expensive in terms of material. In contrast, if the parts
are underdesigned, high stresses may be induced, resulting in short life and
thus requiring frequent replacements.

The optimal life may be determined on the basis of minimizing the total
expected cost, which would include the initial cost, the expected cost of
replacement (a function of the reliability or probability of no failure), and
the expected cost associated with the loss of revenue incurred during a
repair (also a function of reliability). Having decided on the desired design
life, the components may then be proportioned accordingly.


1.8.4. Geotechnical design 


Properties of soil material are inherently heterogeneous, and natural 
earth deposits are characterized by irregular layers of various material 
(for example, clay, silt, sand, gravel, or' combinations thereof) with 
wide ranges of density, moisture content, and other properties that affect 
the strength and compressibility of the deposit. On the other: hand, rock 
formations are often characterized by irregular systems of geologic faults 
and fissures that significantly affect the load-bearing capacity of the rock. 

In designing foundations to support structures and facilities, the capacity
of the in situ subsoil and/or rock deposit must be determined. Invariably,
however, this determination has to be made on the basis of available geologic
information and data from site exploration with limited soil-sampling
results.

Because of the natural heterogeneity and irregularity of soil and rock
deposits, the capacity of the subsoil could vary widely over a foundation
site; moreover, because the required load-bearing capacity must invariably
be estimated on the basis of very limited information, such estimates are
subject to considerable uncertainty. As a consequence, an estimate may
run some risk of overestimating the actual capacity of the soil deposit
at a site; in view of this fact, the safety of a foundation designed on the
basis of such estimates may not be assured with complete confidence,
unless a sufficient margin of safety is provided. On the other hand, an
excessively large safety margin may yield an unnecessarily costly support
system. Therefore the required safety margin for design may be viewed as
a problem involving the trade-off between cost and tolerable risk or
probability of failure.


1.3.5. Construction planning and management 

Many factors in the construction industry are subject to variability and
uncertainty, some of which cannot be controlled. For example, the required
durations of various activities in a construction project will depend on the
availability of resources, including labor and equipment and their respective
productivity, on weather conditions, and on availability of material.
None of these factors are completely predictable, and thus the durations of
the individual activities as well as the project duration cannot be estimated
with much precision or certainty (see for example Fig. 1.16); such durations
must be described as random variables. Therefore, in preparing a bid
for a project, if conservative (or pessimistic) time estimates are used,
the bid price may be too high, thus reducing the chances of winning the
bid. On the other hand, if the bid is based on an optimistic estimate of
the project time, the contractor may lose money in a successful bid. What
degree of conservativeness should the contractor exercise to maximize his
profit potential? Realistically, this decision may be based on a consideration
of probability; the bid price may be based on a target time corresponding
to a specified probability of completion.
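
The idea of a target time tied to a probability of completion can be illustrated with a small simulation. This is only a sketch: the three sequential activities, their triangular duration ranges, and the 50% and 90% completion probabilities are hypothetical values chosen for illustration, not data from the text.

```python
import random

# Hypothetical activities performed one after another:
# (optimistic, most likely, pessimistic) durations in days.
# These numbers are illustrative assumptions only.
ACTIVITIES = [(10, 14, 25), (20, 30, 45), (5, 8, 15)]


def simulate_project_durations(n_trials, seed=0):
    """Simulate the total project duration n_trials times."""
    rng = random.Random(seed)
    totals = []
    for _ in range(n_trials):
        # Random.triangular takes its arguments as (low, high, mode).
        total = sum(rng.triangular(low, high, mode)
                    for low, mode, high in ACTIVITIES)
        totals.append(total)
    return totals


def target_time(totals, prob_completion):
    """Empirical quantile: a simulated duration that was not exceeded
    in at least prob_completion of the trials."""
    ordered = sorted(totals)
    index = min(len(ordered) - 1, int(prob_completion * len(ordered)))
    return ordered[index]


totals = simulate_project_durations(10_000)
t50 = target_time(totals, 0.50)
t90 = target_time(totals, 0.90)
print(f"50% completion time: {t50:.1f} days")
print(f"90% completion time: {t90:.1f} days")
```

A bid based on the 90% time is more conservative than one based on the 50% time; how much more depends entirely on the assumed duration distributions.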


1.3.6. Photogrammetric, geodetic, and surveying measurements


All practical measurements are subject to errors, which can be classified
as random and systematic errors. Systematic errors can be eliminated or
minimized by evaluating them and applying corrections. However, the
magnitude and propagation of random errors, inherent in making measurements,
can be established and analyzed only on the basis of probability
theory. Such a statistical approach is indeed the only reliable means for
determining accuracy once measurements are refined beyond instrument
capabilities.

The accuracy and precision of measurements can be improved by using
instruments capable of keener measurements and by adopting more refined
observational procedures. Depending on the importance and cost of a
project, the additional cost of increased accuracy and precision may or 
may not be justified. 

The method of least-squares is used widely in photogrammetry, geodesy,
and surveying; it is used, for example, in the adjustment of photogrammetric
blocks, geodetic networks, and leveling circuits. A priori estimation
of the accuracy of geodetic coordinates, for example, is essential before
finalizing the selection of the configuration of triangulation and trilateration
projects.

In conjunction with photogrammetrically produced digital terrain
models, least-squares fitting using polynomials, potential functions, or
trigonometric functions is often used to mathematically describe the surface
of the object under study. The object may be a terrain under consideration,
such as a possible airport site, an animal for which the surface area is
to be determined, or a trileaflet heart valve under study to develop prosthetic
valves.

In remote sensing, statistics is used extensively in pattern recognition
techniques where the objective is to classify the image spectrally. Sample
spectral data from the scene are statistically clustered into distinct groups.
This grouping is then often extended to the entire image through the
application of discriminant function analysis.


1.4. CONTROL AND STANDARDS 


In order to assure some minimum level of quality, or performance, of
engineering products or systems, inspections and standards of acceptance
are necessary. Clearly, if the standard is too stringent, it may unnecessarily
increase product cost or its adherence and enforcement may be difficult;
on the other hand, if the standard is too lax, the quality of the product may
be overly compromised. Also, if the control variables or design variables
are random, what constitutes a stringent or nonstringent standard is not
immediately clear; in these cases, the acceptance standards ought to be
developed also on the basis of probability considerations.

For example, in constructing an earth embankment, practical standards
for acceptability of the compaction should recognize the variability in
the density of compacted material, as illustrated in Fig. 1.10. Accordingly,
an acceptance sampling plan may be developed based on probability
considerations and taking into account such variability.




To control the quality of rivers, the quantity most commonly used as
a measure of pollution is the concentration of dissolved oxygen (DO) in the
stream water, which is random as illustrated in Fig. 1.11. Among environmentalists
there is a growing realization of the need for probabilistic
standards of stream quality (for example, Loucks and Lynn, 1966; Thayer,
1966). Loucks and Lynn (1966) proposed the following as an example
probabilistic stream standard:


The dissolved oxygen concentration in the stream during any 7 consecutive
day period must be such that: (i) the probability of its being less than
4 mg/l for any one day is less than 0.20; and (ii) the probability of its
being less than 2 mg/l for any one day is less than 0.1 and for two or more
consecutive days is less than 0.08.


To ensure the quality of concrete material in reinforced concrete con- 
struction, the Building Code of the American Concrete Institute (ACI 
318-71) requires the following. 


The strength level of the concrete will be considered satisfactory if the averages
of all sets of three consecutive strength test results equal or exceed the
required f'c and no individual strength test result falls below the required
f'c by more than 500 psi. Each strength test result shall be the average of
two cylinders from the same sample tested at 28 days or the specified earlier
age.


These statements imply the need for probability and statistics in the
assurance of quality concrete.
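
The two acceptance criteria quoted above translate directly into a check on a sequence of test results. The sketch below is illustrative only: the function name and the sample test values are hypothetical, and each entry is assumed to be already the average of two cylinders.

```python
def strength_is_satisfactory(test_results, fc_required, max_shortfall=500.0):
    """Apply the two quoted criteria to strength test results (psi).

    Each entry is assumed to be one strength test result, that is,
    already the average of two cylinders from the same sample.
    """
    # Criterion 1: the average of every set of three consecutive results
    # must equal or exceed the required strength. With fewer than three
    # results there is no such set, so this criterion cannot be violated.
    for i in range(len(test_results) - 2):
        if sum(test_results[i:i + 3]) / 3.0 < fc_required:
            return False
    # Criterion 2: no individual result may fall below the required
    # strength by more than max_shortfall (500 psi in the quoted code).
    return all(r >= fc_required - max_shortfall for r in test_results)


# Hypothetical test results for a required strength of 3000 psi.
good = [3100, 2950, 3200, 3050, 2900]
bad_average = [3100, 2800, 2900, 3300]  # first three average below 3000
bad_single = [3600, 2400, 3600]         # one result is 600 psi below 3000

print(strength_is_satisfactory(good, 3000))         # True
print(strength_is_satisfactory(bad_average, 3000))  # False
print(strength_is_satisfactory(bad_single, 3000))   # False
```

Note that individual results below the required strength (2950 and 2900 psi in the first list) are tolerated; the criteria implicitly accept that test results scatter.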


1.5. CONCLUDING REMARKS


This chapter has emphasized the importance and role of probability 
concepts and methods in engineering. The examples enumerated and
described in Sections 1.2 to 1.4 should serve to emphasize the pervasiveness
of such concepts in engineering planning and design. In particular, it should 
be stressed that the description of statistical information and estimation 
of statistics, such as means and variances, are not the only applications of 
probability theory; the much more significant role of probability concepts, 
in fact, lies in its unified framework for the quantitative analysis of un- 
certainty and assessment of associated risk, and in the formulation of 
trade-off studies relative to decision making, planning, and design. 

The many examples presented also serve to illustrate, with real data
and realistic engineering problems, that randomness of real-world phenomena
and imperfections of engineering predictions and estimations are
facts of life. Consequently, uncertainties associated with such randomness
and imperfections are unavoidable in engineering planning and design.

Finally, it is important to allay any misconception that extensive data
are required to apply probability concepts; the usefulness and relevance of
such concepts are equally significant, irrespective of the amount of data
or quality of information. Probability is the conceptual and theoretical
basis for modeling and analyzing uncertainty. The availability of data and
quality of information will affect the degree of uncertainty; however, the
lack of sufficient data should not lessen the usefulness of probability as the
proper tool for the analysis of such uncertainty and for the evaluation of
its effects on engineering performance and design.

In the ensuing chapters of this volume, as well as those in Volume II,
the probabilistic concepts and methods necessary for these purposes are
developed.


2. Basic Probability Concepts


2.1. EVENTS AND PROBABILITY


2.1.1. Characteristics of probability problems 


It may be recognized from the discussions in Chapter 1 that when we speak 
of probability, we are referring to the occurrence of an event relative to 
other events; in other words, there is (implicitly at least) more than one 
possibility, since otherwise the problem would be deterministic. For quan- 
titative purposes, therefore, probability can be considered as a numerical 
measure of the likelihood of occurrence of an event relative to a set of 
alternative events.

Accordingly, the first requirement in the formulation of a probabilistic
problem is the identification of the set of all possibilities (that is, the possibility
space) and the event of interest. Probabilities are then associated
with specific events within a particular possibility space.

To illustrate the various aspects of a probabilistic problem, as characterized
above, consider the following examples.


EXAMPLE 2.1 


A contractor is planning the purchase of equipment, including bulldozers,
needed for a new project in a remote area. Suppose that from his previous experience,
he figures there is a 50% chance that each bulldozer can last at least 6 months
without any breakdown. If he purchased 3 bulldozers, what is the probability that
there will be only 1 bulldozer left operative in 6 months?


First we observe that at the end of 6 months, the number of operating bulldozers
may be 0, 1, 2, or 3; therefore this set of numbers constitutes the possibility space
of the number of operational bulldozers after 6 months. However, the probability
of the various possible outcomes cannot be readily determined from the information
that each bulldozer has a 50% chance of remaining operative after 6 months. For
this purpose, the possibility space must be derived in terms of the possible status of
each bulldozer after 6 months, as follows.

If we denote the condition of each bulldozer after 6 months as G for good and 
B for bad conditions, the possible statuses of the three bulldozers would be 


GGG: all three bulldozers in good condition
GGB: first and second bulldozers good, and third one bad
GBB
BBB: all three bulldozers in bad condition
BGG
BBG
GBG
BGB


In this case, therefore, there are a total of 8 possibilities. Since the condition of a 
bulldozer is equally likely to be good or bad, the 8 possible statuses of the 3 bull- 
dozers are also equally likely to occur. It is worth noting that among the 8 possible 
outcomes, only one of them can be realized at the end of 6 months; this means that 
the different possibilities are mutually exclusive (we shall say more on this point in
Section 2.2.2).

Among the 8 possible statuses of the 3 bulldozers, the realization of GBB, BGB,
or BBG is tantamount to the event "only one bulldozer is operational." And since
each possibility is equally likely to occur, the probability of the event within the
above possibility space is 3/8.
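
The counting argument of Example 2.1 can be reproduced by enumerating the possibility space. This is a minimal sketch; as in the example, the 8 statuses are treated as equally likely, which presumes that the bulldozers break down independently of one another.

```python
from fractions import Fraction
from itertools import product

# Each bulldozer is either good ("G") or bad ("B") after 6 months.
outcomes = ["".join(status) for status in product("GB", repeat=3)]

# With a 50% chance per bulldozer, the 8 outcomes are taken as equally likely.
p_outcome = Fraction(1, len(outcomes))

# Event of interest: exactly one bulldozer is still operative.
event = [o for o in outcomes if o.count("G") == 1]
p_event = p_outcome * len(event)

print(len(outcomes))  # 8
print(event)          # ['GBB', 'BGB', 'BBG']
print(p_event)        # 3/8
```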


EXAMPLE 2.2 


In designing a left-turn lane for eastbound traffic at a highway intersection, as
shown in Fig. E2.2, the probability of 5 or more cars waiting for left turns may be
needed to determine the length of the left-turn lane. For this purpose, suppose that
over a period of 2 months 60 observations were made (during periods of heavy
traffic) of the number of eastbound cars waiting for left turns at this intersection,
with the following results.


No. of cars    No. of observations    Relative frequency

0              4                      4/60
1              16                     16/60
2              20                     20/60
3              14                     14/60
4              3                      3/60
5              2                      2/60
6              1                      1/60
7              0                      0
8              0                      0


Conceivably, or theoretically, the number of cars waiting for left turns, during 
heavy traffic hours, could be any positive integer number; however, in the light of 
the above traffic count, the possibility of 7 or more cars waiting for left turns is not 
likely to occur at this intersection.


On the basis of the foregoing results, the observed relative frequency (tabulated
in the third column above) may be used as the probability of a particular number of
cars waiting for left turns. Then the probability of the event "5 or more cars waiting"
is 2/60 + 1/60 + 0 + 0 = 3/60.

Figure E2.2
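
The relative-frequency calculation of Example 2.2 can be sketched as follows. The observation counts are those tabulated in the example; the car counts 0 through 8 attached to them are inferred from the order of the rows and should be read as an assumption.

```python
from fractions import Fraction

# Number of cars waiting -> number of observations.
# The keys 0..8 are inferred from the row order of the table (assumption).
observations = {0: 4, 1: 16, 2: 20, 3: 14, 4: 3, 5: 2, 6: 1, 7: 0, 8: 0}

total = sum(observations.values())
relative_frequency = {n: Fraction(count, total)
                      for n, count in observations.items()}

# Event "5 or more cars waiting": add the relative frequencies
# of the individual outcomes that make up the event.
p_five_or_more = sum(f for n, f in relative_frequency.items() if n >= 5)

print(total)           # 60
print(p_five_or_more)  # 1/20, that is, 3/60
```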


EXAMPLE 2.3

In the simply supported beam AB shown in Fig. E2.3, the load of 100 kg can be
placed anywhere along the beam. In this case, the reaction at the support A, R_A,
clearly can be any value between 0 and 100 kg; therefore any number between 0
and 100 is a possible reaction value for R_A, and thus is its possibility space.

An event of interest then may be that the reaction is in some specified interval;
for example, (10 < R_A ≤ 20 kg) or (R_A > 50 kg). Therefore, if a particular value of
R_A is realized, the event (defined by an interval) containing this value of R_A has
occurred, and we can speak of the probability that R_A will, or will not, be in a given
interval. For instance, if we assume that the 100-kg load is equally likely to be placed
anywhere on the beam, then the probability that the value of R_A will be in a given
interval is proportional to the interval; for example, P(10 < R_A ≤ 20) = 10/100 =
0.10, and P(R_A > 50) = 50/100 = 0.50.

Figure E2.3 (simply supported beam AB, 10-m span, carrying the 100-kg load)
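
The proportionality used in Example 2.3 follows from statics. As a sketch, let x be the distance of the load from support A, W the load, and L the span (these symbols are introduced here only for illustration), and assume that x is uniformly distributed over the span. Taking moments about B gives:

```latex
R_A = W\left(1 - \frac{x}{L}\right), \qquad W = 100\ \text{kg}, \quad 0 \le x \le L
% R_A is a linear function of x, so if x is uniform on [0, L],
% then R_A is uniform on [0, W]. For 0 \le a \le b \le W:
P(a < R_A \le b) = \frac{b - a}{W}
% Hence, for the two events in the example:
P(10 < R_A \le 20) = \frac{20 - 10}{100} = 0.10, \qquad
P(R_A > 50) = \frac{100 - 50}{100} = 0.50
```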


From the foregoing examples, the following special characteristics of
probabilistic problems may be observed.


1. Every problem is defined with reference to a specific possibility space
(containing more than one possibility), and events are composed of
one or more possible outcomes within this possibility space.

2. The probability of an event depends on the probabilities of the individual
outcomes within a given possibility space.


In Sections 2.2 and 2.3, we shall present the mathematical tools pertinent
to and useful for each of these purposes.




2.1.2. Calculation of probability


From the examples discussed above, it can be observed that in calculating
the probability of an event, a basis for assigning probability measures to
the various possible outcomes is necessary. The assignment may be based
on prior conditions (or deduced on the basis of prescribed assumptions),
or based on the results of empirical observations, or both.

In Examples 2.1 and 2.3, the probabilities of the possible outcomes were
based on prior assumptions. In the case of Example 2.1, each of the possible
statuses of the 3 bulldozers was assumed to be equiprobable, each equal to
1/8 (consistent with the prior information that each bulldozer is equally
likely to be operative or nonoperative after 6 months); whereas, in Example
2.3 the probability that the reaction R_A will be in a given interval was
assumed to be proportional to the interval length (consistent with the
assumption that the position of the 100-kg load is equally likely to be anywhere
along the beam). However, in Example 2.2 the probability of the
number of cars waiting for left turns is based on the corresponding observed
relative frequency, which is determined from empirical observations.

It should be emphasized that we shall treat probability as a measure
necessary and useful in problems where more than one event or outcome
is possible. In particular, we shall avoid the philosophical question of the
meaning of a probability measure, and concern ourselves merely with the
utilitarian aspects of probability and its mathematical theory (see Section
2.3) for modeling problems under conditions of uncertainty, in the same
sense that we use the factor of safety to effect engineering design without
worrying about its real meaning, or employ Newton's second law of
motion without being concerned about the meaning of mass and force.

The usefulness of a calculated probability, however, will depend on the
appropriateness of the basis for its determination. In this regard, we observe
that the validity of the a priori basis for calculating probability depends on
the reasonableness of the underlying assumptions, whereas the empirical
relative frequency basis must rely on a large amount of observational data.
When data are limited, the relative frequency by itself may have limited
usefulness.

A third basis for calculating probability involves the combination of 
intuitive or subjective assumptions with experimental observations; the
proper vehicle for this combination is Bayes’ theorem (see Section 2.3.4), 
and the result is known as the Bayesian probability (see Chapter 8). 


2.2. ELEMENTS OF SET THEORY


Many of the characteristics of a probabilistic problem can be defined
formally and modeled succinctly using elementary notions of sets and the
mathematical theory of probability. In this and the following section
(Section 2.3), we present the basic elements of the theories of sets and
probability, as they relate to and are useful for the formulation of probabilistic
problems.


2.2.1. Definitions 


In the terminology of set theory, the set of all possibilities in a probabilistic 
problem is collectively a sample space, and each of the individual possibilities
is a sample point. An event then is defined as a subset of the sample
space.

Sample spaces may be discrete or continuous. In the discrete case, the
sample points are individually discrete entities and countable; in the
continuous case, the sample space is composed of a continuum of sample
points.

A discrete sample space may be finite (that is, composed of a finite
number of sample points) or infinite (that is, with a countably infinite
number of sample points). The possible status of the three bulldozers in
Example 2.1 is an example of a finite discrete sample space; each of the
possible statuses is a sample point, and the eight possibilities collectively
constitute the corresponding sample space. Other examples of finite sample
spaces are as follows. (1) The winner in a competitive bidding for a construction
project will be among the firms submitting bids for the project.
The sample space then consists of all the possible bid winners, which are
the finite number of firms involved in the bidding; in this case, each of the
firms is a sample point. (2) The number of days in a year with freezing
temperature in Juneau, Alaska, is limited to 365 days; each day of the year
then is a sample point, and collectively all the days of the year constitute
the sample space. Example 2.2 is an illustration of a discrete sample space
with a countably infinite number of sample points; the number of cars waiting
for left turns could, theoretically, be any integer number from zero to
infinity. Other examples are the following: (1) the number of flaws in a
given length of weld, and (2) the number of cars crossing a toll bridge until
the next accident on the bridge. In each case we have an infinite number
of discrete possibilities. There may be none or only a few flaws in the weld,
or the number of flaws could be very large; similarly, an accident may occur
with the first car crossing the toll bridge, or there may never be an accident
on the bridge.

In a continuous sample space, the number of sample points is effectively
always infinite. For example, (1) in considering the location on a toll
bridge where a traffic accident may occur, each of the possible locations is
a sample point, and the sample space would be the continuum of points on
the bridge; and (2) if the bearing capacity of a clay deposit is between
1.5 tsf (tons per sq ft) and 4.0 tsf, then any value within the range 1.5 to
4.0 is a sample point, and the entire continuum of values in this range
constitutes the sample space.

Whether a sample space is discrete or continuous, however, an event is
always a subset of the appropriate sample space; therefore an event always
contains one or more sample points (unless it is an impossible event), and
the realization of any of these sample points constitutes the occurrence of the
corresponding event. Finally, when we speak of probability, we are always
referring to an event within a particular sample space.

The following example clarifies the preceding notions in more definitive
terms.


EXAMPLE 2.4 


Consider again a simply supported beam AB (Fig. E2.4a).

(a) If a concentrated load of 100 lb can be placed only at any of the 2-ft interval
points on the beam, the sample space of the reaction R_A will be as shown in Fig.
E2.4b. In this case, the sample space of R_A consists of distinct sample points.

Let us also consider the sample space of R_A and R_B (that is, all possible pairs of
values of R_A and R_B); in this case, any pair of values of R_A and R_B such that
R_A + R_B = 100 belongs to the sample space, which is shown in Fig. E2.4c.

(b) If the load can be placed anywhere along the beam, the sample space of R_A
can be represented by the line between 0 and 100 (Fig. E2.4d), whereas the corresponding
sample space of R_A and R_B is the straight line shown in Fig. E2.4e.


Figure E2.4a (beam AB, 20-ft span, 100-lb load)

Figure E2.4b Sample space of R_A: 0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100 lb

Figure E2.4c Sample space of R_A and R_B

Figure E2.4d Sample space of R_A

Figure E2.4e Sample space of R_A and R_B (all point pairs on the line, for example (40, 60))

Figure E2.4f Sample space of R_A: 0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 120,
140, 150, 160, 180, 200, 210, 240, 270, 300 lb


We can then speak of the event that R_A will be between, say, 20 and 40; or that
(R_A, R_B) will be between (20, 80) and (40, 60).

(c) Next consider that the load can be 100 lb, 200 lb, or 300 lb, and its position
can be at any 2-ft interval on the beam. The sample space of R_A then contains the
values listed in Fig. E2.4f, whereas the sample space of R_A and R_B is represented by
the two-dimensional coordinates of the points shown in Fig. E2.4g.

However, if the load can be placed anywhere along the beam, then the sample
space of R_A and R_B would be described by the three lines shown in Fig. E2.4h.

(d) If the load can be any value between 100 and 300 lb, then the sample space of
R_A contains all values between 0 and 300 lb, as represented by the line in Fig. E2.4i,
whereas the sample space of R_A and R_B would be the shaded area shown in Fig.
E2.4j.
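
The discrete sample spaces in parts (a) and (c) of Example 2.4 can be generated by enumeration. The sketch below assumes a 20-ft span (as read from Fig. E2.4a) and computes the reaction at A by taking moments about B; the function and variable names are illustrative.

```python
from fractions import Fraction

SPAN_FT = 20     # span of beam AB, as read from Fig. E2.4a (assumption)
SPACING_FT = 2   # the load may be placed only at 2-ft interval points


def reaction_at_a(load_lb, x_ft):
    """Reaction at A for a load placed x_ft from A (moments about B)."""
    return Fraction(load_lb * (SPAN_FT - x_ft), SPAN_FT)


def sample_space_r_a(loads_lb):
    """Sorted distinct values of R_A over all loads and load positions."""
    positions = range(0, SPAN_FT + 1, SPACING_FT)
    return sorted({reaction_at_a(w, x) for w in loads_lb for x in positions})


# Part (a): a single 100-lb load.
part_a = [int(r) for r in sample_space_r_a([100])]
# Part (c): the load may be 100, 200, or 300 lb.
part_c = [int(r) for r in sample_space_r_a([100, 200, 300])]

print(part_a)  # multiples of 10 from 0 to 100
print(part_c)  # the 21 values listed in Fig. E2.4f
```

Because R_B is the load minus R_A, the sample points of the pair (R_A, R_B) follow from the same enumeration.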


Figure E2.4g Sample points in sample space of R_A and R_B

Figure E2.4h Sample space of R_A and R_B

Figure E2.4i Sample space of R_A (0 to 300 lb)

Figure E2.4j Sample space of R_A and R_B


Special events. We define the following special events and adopt the 
notations indicated below. 


1. Impossible event, denoted φ, is the event with no sample point. It is
therefore an empty set in a sample space.

2. Certain event, denoted S, is the event containing all the sample points
in a sample space; that is, it is the sample space itself.

3. Complementary event Ē. For an event E in a sample space S, the complementary
event, denoted Ē, contains all the sample points in S that
are not in E.

The Venn diagram. A sample space and the events within it can be
represented pictorially with the Venn diagram: a sample space is represented
by a rectangle; an event E is then represented symbolically by a
closed region within this rectangle, and the part of the rectangle outside
this closed region is the corresponding complementary event Ē. See Fig. 2.1.
In other words, the event E contains all the sample points within the closed
region, whereas Ē contains all the sample points outside of E. A Venn
diagram with two (or more) events would appear as in Fig. 2.2.

Figure 2.1 A Venn diagram

Figure 2.2 Venn diagram with several events. (a) Two events, A and B. (b)
Three events, A, B, and C


2.2.2. Combination of events 


In many practical problems, the event of interest may be some combination
of other events. For instance, in Example 2.1 the event of at least 2 bulldozers
operative after 6 months may be of interest. This can be considered as
the combination of 2 bulldozers operative or 3 bulldozers operative. Such an
event is the "union" of the two individual events.

There are two basic ways that events may be combined or derived from
other events: by the union or intersection. Consider two events E_1 and E_2.
The union of E_1 and E_2, denoted E_1 ∪ E_2, is another event that means the
occurrence of E_1 or E_2, or both. In other words, E_1 ∪ E_2 is the subset of
sample points that belong to E_1 or E_2. (In set theory, or is used in an inclusive
sense, which means and/or.)


Examples: (1) In describing the state of supply of construction material,
if E_1 represents the shortage of concrete and E_2 represents the shortage of
steel, then the union E_1 ∪ E_2 is the shortage of concrete or steel, or both.
(2) In a 20-mile length of an oil pipeline, if E_1 stands for leakage in mile 0
to 15 and E_2 stands for leakage in mile 10 to 20, then E_1 ∪ E_2 means leakage
anywhere along the entire 20-mile pipeline.


Figure 2.3 Venn diagram for union of events E_1 and E_2

Figure 2.4 Venn diagram of intersection of events E_1 and E_2

The Venn diagram for the union of two events E_1 and E_2 would be the
shaded region shown in Fig. 2.3. It follows, therefore, that the portion of
the rectangle outside of the shaded region in Fig. 2.3 is the complementary
event; that is, the complement of E_1 ∪ E_2.

The union of 3 or more events means the occurrence of at least one of
them. For example, transportation between Chicago and New York may
be by air, highway, or railway. If we denote the availability of these modes
of transport, respectively, as A, H, and R, the available means of transporting
material between these two cities can be denoted as (A ∪ H ∪ R).

The intersection of E_1 and E_2, denoted E_1 ∩ E_2 (or simply E_1 E_2), is also
an event that means the joint occurrence of E_1 and E_2; in other words,
E_1 E_2 is the subset of sample points belonging to both E_1 and E_2.


Examples: Referring to the examples described above, (1) E_1 E_2 means
the shortage of concrete and steel; (2) E_1 E_2 means the leakage in mile 10
to 15 along the pipeline; whereas, AHR means all three methods of transport
between Chicago and New York are available.


In terms of the Venn diagram, the intersection of two events E_1 and E_2
would be as shown in Fig. 2.4.


EXAMPLE 2.5 


In Example 2.2 the sample space is the set {0, 1, 2, 3, ...}; that is, theoretically
the sample space contains all non-negative integers.
If E_1 = the event of more than two cars waiting for left turns;
that is, the subset {3, 4, 5, ...}
and
E_2 = the event of between two and four cars waiting for left turns;
that is, the subset {2, 3, 4}
then the union E_1 ∪ E_2 is the subset {2, 3, 4, ...} whereas the intersection E_1 E_2
is the subset {3, 4}.
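
The union and intersection in Example 2.5 map directly onto set operations. In the sketch below the infinite sample space is truncated at an arbitrary upper bound; MAX_CARS is an assumption made only so that the sets are finite.

```python
# The sample space {0, 1, 2, ...} is infinite; truncate it at an arbitrary
# bound purely for illustration (MAX_CARS is not part of the example).
MAX_CARS = 20
sample_space = set(range(MAX_CARS + 1))

e1 = {n for n in sample_space if n > 2}        # more than two cars waiting
e2 = {n for n in sample_space if 2 <= n <= 4}  # between two and four cars

union = e1 | e2          # sample points in E_1 or E_2, or both
intersection = e1 & e2   # sample points in both E_1 and E_2

print(sorted(union)[:5])     # [2, 3, 4, 5, 6]
print(sorted(intersection))  # [3, 4]
```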


Figure E2.6a Event A    Figure E2.6b Event B

Figure E2.6c Union A ∪ B    Figure E2.6d Intersection AB


EXAMPLE 2.6 


In the last case of Example 2.4, where the load can range between 100 and 300 lb,
the sample space of the reactions R_A and R_B is shown in Fig. E2.4j.
If A = event {R_A > 100 lb}
and
B = event {R_B > 100 lb}
the events A and B would be the subsets containing all point pairs of R_A and R_B
shown in Figs. E2.6a and E2.6b, respectively. Observe that the events A and B
are defined within the sample space of R_A and R_B. Then the union A ∪ B contains
all point pairs in the shaded region of Fig. E2.6c; whereas the intersection AB is
the shaded region shown in Fig. E2.6d. In the present example, Figs. E2.6a through
E2.6d serve also as the corresponding Venn diagrams.


Figure 2.5 Mutually exclusive events E_1 and E_2


Mutually exclusive events. If the occurrence of one event precludes
the occurrence of another event, then the two events are mutually exclusive;
this means that the corresponding subsets will have no overlap, as shown
in the Venn diagram of Fig. 2.5. That is, the subsets are "disjoint." The
intersection of two mutually exclusive events E_1 and E_2, therefore, is an
impossible event; that is, E_1 E_2 = φ. Examples of mutually exclusive events
are (1) making right turn and left turn at a street intersection; (2) flood
and drought of a river at a given instant of time; (3) failure and survival
of a structure to a strong motion earthquake.

Three or more events are mutually exclusive if the occurrence of one 
precludes the occurrence of all others. For example, if there are three 
possible locations for a new airport, then the choices among the three sites 
are mutually exclusive. 


Collectively exhaustive events. Two or more events are collectively
exhaustive if the union of all these events constitutes the underlying sample
space.


EXAMPLE 2.7 


Two construction companies a and b are bidding for jobs. Let A denote the
event that Company a gets a job and B denote the event that Company b gets a job.
Draw the Venn diagrams for the sample spaces of the following:

(a) Company a is submitting a bid for one job and Company b is submitting a bid
for another job.

(b) Companies a and b are submitting bids to the same job, and there are more than
2 bidders for the job.

(c) Companies a and b are the only two bidders competing for the same job.

(a) Since companies a and b may each win a job, the Venn diagram is as shown
in Fig. E2.7a. The overlapping region indicates that both companies a and b win
jobs. In this case, events A and B are not mutually exclusive.

(b) Company a may win the job, or company b may win the job, or some other
bidder will win the job. But, if company a wins the job, then event B will never occur.
Therefore event A precludes the occurrence of event B, and vice versa; hence
events A and B are mutually exclusive. There is no overlapping region in the Venn
diagram for events A and B, as shown in Fig. E2.7b. In this case, the complementary
event of (A ∪ B) means that neither company a nor company b wins the job.

(c) In this case, the sample space only contains the two events A and B. If event
A does not occur, that is, if company a loses, then we know definitely that B has
occurred. Events A and B are again mutually exclusive; also, A and B are collectively
exhaustive, that is A ∪ B = S. Hence the corresponding Venn diagram would
appear as in Fig. E2.7c.

Figure E2.7a    Figure E2.7b    Figure E2.7c


2.2.3. Operational rules 


Sets and the relationships among sets are governed by certain operational 
rules. In this connection, we adopt the following symbols to designate sets 
or their associated operations: 


    $\cup$       union
    $\cap$       intersection
    $\subset$    belongs to, or is contained in
    $\supset$    contains
    $\bar{E}$    complement of E


We have seen in Section 2.2.2 that two or more sets can be combined in two
ways: through union and intersection. These and the process of taking the
complement constitute the basic operations on sets. The rules that govern
these operations are the following:


Equality of sets. Two sets are equal if and only if both sets contain 
exactly the same sample points. On this basis, we immediately observe that 


    $A \cup \phi = A$
    $A \cap \phi = \phi$    (2.1a)

Also, referring to Fig. 2.6, we have

    $A \cup A = A$
    $A \cap A = A$    (2.1b)

Furthermore,

    $A \cup S = S$
    $A \cap S = A$    (2.1c)

Figure 2.6 Event A
Figure 2.7 Venn diagram of two sets A and B


Complementary sets. From Fig. 2.1, we observe the following relative
to an event E and its complement $\bar{E}$:

    $E \cup \bar{E} = S$
    $E \cap \bar{E} = \phi$    (2.2)
    $\bar{\bar{E}} = E$

(that is, the complement of the complementary event is the original event).


Commutative rule. Union and intersection of sets are commutative;
that is,

    $A \cup B = B \cup A$
    $AB = BA$

From the Venn diagram of Fig. 2.7, we see that $A \cup B$ and $B \cup A$ clearly
contain the same set of sample points and, therefore, are equal subsets
within S. Similarly, the same is true of AB and BA.


Associative rule. Union and intersection of sets are associative; that is,

    $(A \cup B) \cup C = A \cup (B \cup C)$
    $(AB)C = A(BC)$

The equality of the sets $(A \cup B) \cup C$ and $A \cup (B \cup C)$ is clear from the
Venn diagram of Fig. 2.8, whereas from Fig. 2.9, we see that $(AB)C = A(BC)$.

Figure 2.8 Venn diagrams for $(A \cup B) \cup C$ and $A \cup (B \cup C)$

Distributive rule. Union and intersection of sets are distributive; that is,

    $(A \cup B)C = AC \cup BC$
    $(AB) \cup C = (A \cup C)(B \cup C)$

In this case, the two equalities of the sets are verified by the Venn diagrams
of Figs. 2.10 and 2.11, respectively.

These operational rules imply that the rules governing the addition and
multiplication of numbers apply to the union and intersection of sets.
By assuming the following equivalences, union for addition and intersection
for multiplication (that is, $\cup \to +$ and $\cap \to \times$), the rules of conventional
algebra then apply to operations of sets and events. Therefore, in
accordance with the hierarchy of algebraic operations, intersection takes
precedence over union of events, unless parenthetically indicated otherwise.
It should be emphasized, however, that conventional algebraic operations,
such as addition and multiplication, have no meaning relative to sets
and events. Moreover, there are operations and operational rules that
apply to sets that have no counterparts in conventional algebra of numbers.
For example, $A \cup A = A$ and $A \cap A = A$. Another case in point
is the second of the distributive rules described above, which says that

    $(A \cup C)(B \cup C) = AB \cup AC \cup CB \cup CC$
    $= AB \cup C$

whereas, in conventional algebra, we have

    $(a + c)(b + c) = ab + ac + cb + c^2 \neq ab + c$

Finally, another rule that also has no counterpart in conventional algebra
is de Morgan's rule, described below.

Figure 2.9 Venn diagrams for $(AB)C$ and $A(BC)$
Figure 2.10 Venn diagrams for $A(B \cup C)$ and $AB \cup AC$
Figure 2.11 Venn diagrams for $AB \cup C$ and $(A \cup C)(B \cup C)$
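The operational rules of this section can also be checked numerically. The following is a minimal sketch, assuming a small illustrative sample space of six integer "sample points" and a handful of arbitrary events (none of these values come from the text); it tests the idempotent, commutative, associative, and distributive rules on every triple of events using Python's built-in sets.

```python
from itertools import product

# Illustrative sample space and events (assumed values, not from the text).
S = frozenset(range(6))
events = [frozenset(), frozenset({0, 1}), frozenset({1, 2, 3}),
          frozenset({3, 4}), S]


def rules_hold(A, B, C):
    """Return True if the set rules of this section hold for A, B, C."""
    return all([
        A | A == A and A & A == A,           # Eq. 2.1b
        A | B == B | A and A & B == B & A,   # commutative rule
        (A | B) | C == A | (B | C),          # associative rule (union)
        (A & B) & C == A & (B & C),          # associative rule (intersection)
        (A | B) & C == (A & C) | (B & C),    # first distributive rule
        (A & B) | C == (A | C) & (B | C),    # second distributive rule
    ])


# Check every ordered triple of the illustrative events.
all_ok = all(rules_hold(A, B, C) for A, B, C in product(events, repeat=3))
print(all_ok)
```

A check of this kind is not a proof; it only confirms the rules on the chosen events, in the same spirit as the Venn diagrams.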


de Morgan's rule. Another rule in set theory is de Morgan's rule,
which relates sets and their complements. For two events $E_1$ and $E_2$, this
rule says that

    $\overline{E_1 \cup E_2} = \bar{E}_1 \cap \bar{E}_2$

To prove this relation, consider the two events $E_1$ and $E_2$ as shown in Fig.
2.12. The unshaded region in Fig. 2.12a is clearly $\overline{E_1 \cup E_2}$. The Venn
diagrams with $\bar{E}_1$ and $\bar{E}_2$ are individually shown in Fig. 2.12b, the
intersection of which is the double-hatched region in Fig. 2.12c. Comparing
Figs. 2.12a and 2.12c, we have the above relation, $\overline{E_1 \cup E_2} = \bar{E}_1 \cap \bar{E}_2$.

Figure 2.12 Venn diagram for de Morgan's rule


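Stated in general, de Morgan's rule is

    $\overline{E_1 \cup E_2 \cup \cdots \cup E_n} = \bar{E}_1 \bar{E}_2 \cdots \bar{E}_n$    (2.3a)

Applying Eq. 2.3a to $\bar{E}_1, \bar{E}_2, \ldots, \bar{E}_n$, we have

    $\overline{\bar{E}_1 \cup \bar{E}_2 \cup \cdots \cup \bar{E}_n} = E_1 E_2 \cdots E_n$

Hence, taking the complements of both sides of this equation, de Morgan's
rule can be stated also as

    $\overline{E_1 E_2 \cdots E_n} = \bar{E}_1 \cup \bar{E}_2 \cup \cdots \cup \bar{E}_n$    (2.3b)

In view of Eqs. 2.3a and 2.3b we have the following duality relation.
The complement of unions and intersections is equal to the intersections and
unions of the respective complements. For example,

    $\overline{A \cup BC} = \bar{A} \cap \overline{BC} = \bar{A}(\bar{B} \cup \bar{C}) = \bar{A}\bar{B} \cup \bar{A}\bar{C}$
    $\overline{(A \cup B)C} = \overline{(A \cup B)} \cup \bar{C} = (\bar{A}\bar{B}) \cup \bar{C}$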
(EE; U Ey) (By U Ej) = Eb: U BE; = EAE, nA wg w 


EXAMPLE 2.8 


A chain consists of two links, as shown in Fig. E2.8. Clearly, the chain will fail
if either link breaks; thus, if $E_1$ = breakage of link 1, and $E_2$ = breakage of link 2,
then

    Failure of chain = $E_1 \cup E_2$

and no failure of the chain is, therefore, $\overline{E_1 \cup E_2}$. However, no failure of the chain
also means that both links survive; that is,

    No failure of chain = $\bar{E}_1 \cap \bar{E}_2$

Therefore

    $\overline{E_1 \cup E_2} = \bar{E}_1 \bar{E}_2$

which is an illustration of de Morgan's rule.

Figure E2.8 A two-link chain


EXAMPLE 2.9 


The water supply for a city C comes from two sources A and B. The water is 
transported by a pipeline consisting of branches 1, 2, and 3, as shown in Fig. E2.9. 
Assume that either source alone is sufficient to supply the water for the city. 
Denote

    $E_1$ = failure of branch 1
    $E_2$ = failure of branch 2
    $E_3$ = failure of branch 3

Then shortage of water in the city would be caused by $E_1 E_2 \cup E_3$. Therefore, by
de Morgan's rule, no shortage means that

    $\overline{E_1 E_2 \cup E_3} = (\bar{E}_1 \cup \bar{E}_2)\bar{E}_3$

in which $(\bar{E}_1 \cup \bar{E}_2)$ means the availability of water at the junction, and $\bar{E}_3$ means
no failure of branch 3.

Figure E2.9 Water-supply system (sources A and B)
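The de Morgan identity used in Example 2.9 can be confirmed by brute force. The sketch below treats each branch failure as a Boolean (True meaning the branch has failed) and compares the complement of the shortage event with the de Morgan form over all eight combinations; the function names are illustrative choices, not notation from the text.

```python
from itertools import product


def shortage(e1, e2, e3):
    """Shortage: branches 1 and 2 both fail, or branch 3 fails (E1 E2 U E3)."""
    return (e1 and e2) or e3


def supply_ok(e1, e2, e3):
    """de Morgan form of no shortage: (not E1 or not E2) and not E3."""
    return ((not e1) or (not e2)) and (not e3)


# Compare the two expressions for every combination of branch failures.
de_morgan_holds = all(
    (not shortage(e1, e2, e3)) == supply_ok(e1, e2, e3)
    for e1, e2, e3 in product([False, True], repeat=3)
)
print(de_morgan_holds)
```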


2.3. MATHEMATICS OF PROBABILITY


2.3.1. Basic axioms of probability; addition rule


It may be pointed out that in all of our discussions thus far, we have tacitly
assumed that a nonnegative measure, called probability, is associated with
every event. Implicitly, we have also assumed that such measures possess
certain properties and follow certain operational rules. Formally, these
properties and rules are embodied in the mathematical theory of probability.
As in other branches of mathematics, the theory of probability is based on
certain fundamental assumptions, or axioms, as follows.

For every event E in a sample space S, there is a probability

    $P(E) \geq 0$    (2.4)

Secondly, the probability of the certain event S is

    $P(S) = 1.0$    (2.5)

Finally, for two events $E_1$ and $E_2$ that are mutually exclusive,

    $P(E_1 \cup E_2) = P(E_1) + P(E_2)$    (2.6)


Equations 2.4 through 2.6 then constitute the basic axioms of probability
theory. These are essential assumptions and therefore are not subject to
proof. However, these axioms and the resulting theory must be consistent
with and useful for real-world problems. In this latter regard, we observe
that in essence, the probability of an event is a relative measure (that is,
relative to other events in the same sample space); for this purpose, therefore,
it is convenient or natural to assume such a measure to be nonnegative
as prescribed in Eq. 2.4. Moreover, because an event E is always defined
within a prescribed sample space S, it is convenient to normalize the
probability of an event with respect to S (the certain event), as specified
in Eq. 2.5. On the basis of Eqs. 2.4 and 2.5, it follows that the probability
of an event E is bounded between 0 and 1.0; that is,

    $0 \leq P(E) \leq 1.0$


With regard to the third axiom, Eq. 2.6, we observe that from a relative
frequency standpoint, if an event $E_1$ occurs $n_1$ times among n repetitions
of an experiment, and another event $E_2$ occurs $n_2$ times (in which $E_1$ and
$E_2$ are mutually exclusive), then $E_1$ or $E_2$ will have occurred $(n_1 + n_2)$
times. Hence, on the basis of relative frequency, we have

    $P(E_1 \cup E_2) = \frac{n_1 + n_2}{n} = \frac{n_1}{n} + \frac{n_2}{n}$
    $= P(E_1) + P(E_2)$


It should be emphasized that the mathematical theory of probability is
concerned with the logical bases for the relationships among probability
measures. All such relationships and the deductive character of the theory
can be developed entirely on the basis of the three assumptions described in
Eqs. 2.4 through 2.6.


Applying Eq. 2.6 to E and its complement $\bar{E}$, we have

    $P(E \cup \bar{E}) = P(E) + P(\bar{E})$

but since $E \cup \bar{E} = S$, we have, on the basis of Eq. 2.5,

    $P(E \cup \bar{E}) = P(S) = 1.0$

Hence the useful relation

    $P(\bar{E}) = 1 - P(E)$    (2.7)

More generally, if $E_1$ and $E_2$ are not mutually exclusive, then

    $P(E_1 \cup E_2) = P(E_1) + P(E_2) - P(E_1 E_2)$    (2.8)

Equation 2.8 follows from Eq. 2.6 by observing from Fig. 2.13 that
$E_1 \cup E_2 = E_1 \cup \bar{E}_1 E_2$, where the events $E_1$ and $\bar{E}_1 E_2$ are mutually exclusive;
thus, according to Eq. 2.6,

    $P(E_1 \cup \bar{E}_1 E_2) = P(E_1) + P(\bar{E}_1 E_2)$

But

    $E_1 E_2 \cup \bar{E}_1 E_2 = S E_2 = E_2$

and $E_1 E_2$ and $\bar{E}_1 E_2$ are mutually exclusive; hence

    $P(\bar{E}_1 E_2) = P(E_2) - P(E_1 E_2)$

thus obtaining Eq. 2.8.

Figure 2.13 Union of $E_1$ and $E_2$


EXAMPLE 2.10 


A contractor is starting on two new projects, jobs 1 and 2. There is some
uncertainty on the completion time for each job; in one year, each job may be definitely
completed, completion questionable, or definitely incomplete. Let us denote these
situations as A, B, and C, respectively, for each job. Describe the sample space for
the state of completion of the two jobs; that is, describe all the possible situations of
jobs 1 and 2 after one year.

If each of the possibilities for the two jobs is equally likely to occur at the end of
one year, what is the probability that exactly one job will be definitely completed in
one year?


Figure E2.10a Sample space    Figure E2.10b

The sample space is shown in Fig. E2.10a. Since the event of exactly one job completed
contains the four sample points AB, AC, BA, and CA, this probability is equal to
4 x 1/9 = 4/9.

In this problem, if we let $E_1$ be the event that job 1 is definitely completed, and $E_2$
that job 2 is definitely completed, then

    $E_1 \supset \{AA, AB, AC\}$
    $E_2 \supset \{AA, BA, CA\}$

The Venn diagram with events $E_1$ and $E_2$ will appear as in Fig. E2.10b. If the sample
points are equally likely to occur, then $P(E_1) = 3/9$, $P(E_2) = 3/9$, and according
to Eq. 2.8

    $P(E_1 \cup E_2) = 3/9 + 3/9 - 1/9 = 5/9$

which can be verified since $(E_1 \cup E_2) \supset \{AA, AB, AC, BA, CA\}$.
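The counting in this example can be reproduced by enumerating the sample space. The following is a minimal sketch, assuming each sample point is written as a two-letter string (job 1 state, then job 2 state) and that all nine points are equally likely; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

# A = definitely completed, B = completion questionable, C = definitely incomplete
states = "ABC"
sample_space = ["".join(p) for p in product(states, repeat=2)]  # job 1, job 2
p_point = Fraction(1, len(sample_space))  # equally likely sample points


def prob(event):
    """Probability of an event given as a set of sample points."""
    return p_point * len(event)


E1 = {s for s in sample_space if s[0] == "A"}  # job 1 definitely completed
E2 = {s for s in sample_space if s[1] == "A"}  # job 2 definitely completed

p_exactly_one = prob(E1 ^ E2)                          # symmetric difference
p_union_direct = prob(E1 | E2)                         # count the union
p_union_eq28 = prob(E1) + prob(E2) - prob(E1 & E2)     # addition rule, Eq. 2.8

print(p_exactly_one, p_union_direct, p_union_eq28)
```

Using exact fractions avoids rounding, so the direct count and the addition rule can be compared for equality.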


EXAMPLE 2.11 


For the purpose of designing the left-turn lane (for eastbound traffic) in Example
2.2, the 60 observations (made at random) of the number of cars waiting for left
turns at the intersection yielded the following results:

    No. of cars    No. of observations    Relative frequency
         0                  4                    4/60
         1                 16                   16/60
         2                 20                   20/60
         3                 14                   14/60
         4                  3                    3/60
         5                  2                    2/60
         6                  1                    1/60
         7                  0                    0
         8                  0                    0


Let

    $E_1$ = more than 2 cars waiting for left turns
    $E_2$ = 2 to 4 cars waiting for left turns

Since the different numbers of cars waiting for left turns are mutually exclusive
events, applying a simple extension of Eq. 2.6 (see Eq. 2.6a) and using the above
relative frequencies to represent the corresponding probabilities, we obtain

    $P(E_1) = \frac{14}{60} + \frac{3}{60} + \frac{2}{60} + \frac{1}{60} = \frac{20}{60}$

whereas

    $P(E_2) = \frac{20}{60} + \frac{14}{60} + \frac{3}{60} = \frac{37}{60}$

Also, in terms of the number of cars waiting for left turns,

    $E_1 E_2 \supset \{3, 4\}$

and thus

    $P(E_1 E_2) = \frac{14}{60} + \frac{3}{60} = \frac{17}{60}$

Then, according to Eq. 2.8,

    $P(E_1 \cup E_2) = \frac{20}{60} + \frac{37}{60} - \frac{17}{60} = \frac{40}{60}$

In this case, we also observe that

    $E_1 \cup E_2 \supset \{2, 3, 4, \ldots\}$

Hence

    $P(E_1 \cup E_2) = \frac{20}{60} + \frac{14}{60} + \frac{3}{60} + \frac{2}{60} + \frac{1}{60} = \frac{40}{60}$

thus verifying the result obtained above using Eq. 2.8.
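The arithmetic of this example is easy to recheck. The sketch below assumes that the nine observation counts correspond to 0 through 8 cars waiting (this assignment is consistent with the probabilities computed above) and evaluates the union both by the addition rule and by direct summation.

```python
from fractions import Fraction

# Assumed mapping: number of cars waiting -> number of observations.
observations = {0: 4, 1: 16, 2: 20, 3: 14, 4: 3, 5: 2, 6: 1, 7: 0, 8: 0}
total = sum(observations.values())  # 60 observations in all


def prob(event):
    """Relative frequency of an event given as a set of car counts."""
    return sum(Fraction(observations[k], total) for k in event)


E1 = {k for k in observations if k > 2}         # more than 2 cars
E2 = {k for k in observations if 2 <= k <= 4}   # 2 to 4 cars

p_E1 = prob(E1)
p_E2 = prob(E2)
p_joint = prob(E1 & E2)
p_union_eq28 = p_E1 + p_E2 - p_joint   # addition rule, Eq. 2.8
p_union_direct = prob(E1 | E2)         # direct summation over {2, 3, ...}

print(p_E1, p_E2, p_joint, p_union_eq28, p_union_direct)
```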


EXAMPLE 2.12 


In Example 2.6, events are represented by areas in the sample space of R and Ry, 
as shown in the Venn diagram of Fig. E2.12. If the probability of an event is
proportional to its "area" (this corresponds to the assumption that the sample points
are equally likely), we obtain the following.

    Total "area" of sample space = $\frac{1}{2}[(300)^2 - (100)^2] = 40,000$

Then, referring to Fig. E2.12, we have

    $P(A) = \frac{\frac{1}{2}(200)^2}{40,000} = \frac{1}{2}$

Similarly,

    $P(B) = \frac{1}{2}$

Figure E2.12


whereas : 3 
P(AB) = en = i : 
and . 
P(A U B) = oo E : 
By Eq. 2.8, we also obtain Pe uti. d 
ill E 
For three events $E_1$, $E_2$, $E_3$,

    $P(E_1 \cup E_2 \cup E_3) = P[(E_1 \cup E_2) \cup E_3]$
    $= P(E_1 \cup E_2) + P(E_3) - P[(E_1 \cup E_2)E_3]$
    $= P(E_1) + P(E_2) + P(E_3) - P(E_1 E_2) - P(E_1 E_3) - P(E_2 E_3) + P(E_1 E_2 E_3)$    (2.9)

The preceding procedure may be extended to the union of any number of
events; however, for n events, the probability of the union may be obtained
more conveniently using de Morgan's rule, as follows:

    $P(E_1 \cup E_2 \cup \cdots \cup E_n) = 1 - P(\overline{E_1 \cup E_2 \cup \cdots \cup E_n})$
    $= 1 - P(\bar{E}_1 \bar{E}_2 \cdots \bar{E}_n)$    (2.10)

However, if the n events are mutually exclusive, extension of the third
proposition (Eq. 2.6) yields

    $P(E_1 \cup E_2 \cup \cdots \cup E_n) = \sum_{i=1}^{n} P(E_i)$    (2.6a)


Figure E2.13 


EXAMPLE 2.13 


Under the load F, the probabilities of failure of the individual members a, b, and
c of the truss shown in Fig. E2.13 are 0.05, 0.04, and 0.03, respectively. The failure
of any member(s) will constitute failure of the truss.

Assuming that failures of the individual members are statistically independent,
so that the failure probability of two or more members is equal to the product of the
respective member probabilities (see Eq. 2.15), determine the failure
probability of the truss.

Denoting the failure events of the three members as A, B, and C, we have P(A) =
0.05, P(B) = 0.04, and P(C) = 0.03. And with the assumption of statistical
independence,

    P(AB) = (0.05)(0.04) = 0.0020
    P(AC) = (0.05)(0.03) = 0.0015
    P(BC) = (0.04)(0.03) = 0.0012

and

    P(ABC) = (0.05)(0.04)(0.03) = 0.00006

Then, according to Eq. 2.9,

    P(failure of truss) = $P(A \cup B \cup C)$
    = 0.05 + 0.04 + 0.03 - 0.0020 - 0.0015 - 0.0012 + 0.00006
    = 0.11536

This probability may also be obtained (more conveniently) with Eq. 2.10 as follows:

    $P(A \cup B \cup C) = 1 - P(\bar{A}\bar{B}\bar{C})$

In the present case (see Eq. 2.16), we have

    $P(\bar{A}\bar{B}\bar{C}) = P(\bar{A})P(\bar{B})P(\bar{C})$
    = (1 - 0.05)(1 - 0.04)(1 - 0.03) = 0.88464

Hence

    P(failure) = 1 - 0.88464 = 0.11536
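The two routes to the truss failure probability can be compared in a few lines. The following is a minimal sketch, assuming statistical independence as in the example; the two helper functions are illustrative names, with the first generalizing Eq. 2.9 to any number of independent events and the second implementing Eq. 2.10.

```python
from itertools import combinations
from math import prod

p_fail = [0.05, 0.04, 0.03]  # members a, b, c (Example 2.13)


def union_by_inclusion_exclusion(probs):
    """P(union) for independent events by alternating sums (Eq. 2.9 pattern)."""
    total = 0.0
    for k in range(1, len(probs) + 1):
        sign = (-1) ** (k + 1)
        # Independence: each joint probability is a product of member values.
        total += sign * sum(prod(c) for c in combinations(probs, k))
    return total


def union_by_complement(probs):
    """P(union) = 1 - P(no event occurs), Eq. 2.10 with independence."""
    return 1.0 - prod(1.0 - p for p in probs)


p_truss_a = union_by_inclusion_exclusion(p_fail)
p_truss_b = union_by_complement(p_fail)
print(round(p_truss_a, 5), round(p_truss_b, 5))
```

The complement form needs only n factors, whereas the expansion has 2^n - 1 terms, which is why Eq. 2.10 is the more convenient choice as n grows.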




2.3.2. Conditional probability; multiplication rule

The probability of an event may depend on the occurrence (or nonoccurrence)
of another event. If this dependence exists, the associated
probability is a conditional probability.

In the sample space of Fig. 2.14, the conditional probability of $E_1$ assuming
$E_2$ has occurred, denoted $P(E_1 \mid E_2)$, means the likelihood of
realizing a sample point in $E_1$ assuming that it belongs to $E_2$. Effectively, in
other words, we are interested in the event $E_1$ within the reconstituted
sample space $E_2$. Hence, with the appropriate normalization, we obtain the
conditional probability of $E_1$ given $E_2$ as

    $P(E_1 \mid E_2) = \frac{P(E_1 E_2)}{P(E_2)}$    (2.11)

To clarify this concept, consider the following examples.

Figure 2.14 Reconstituted sample space $E_2$


EXAMPLE 2.14 


Consider a 100-km (kilometer) highway, and assume that the road condition and
traffic volume are uniform throughout the 100-km distance, so that accidents are
equally likely to occur anywhere on the highway. Define the events

    A = an accident in kilometers 0 to 30
    B = an accident in kilometers 20 to 60

Since accidents are equally likely anywhere along the highway, it may be assumed
that the probability of an accident in a given interval of the highway is proportional
to the distance of the interval. Therefore, if an accident occurs on this 100-km
highway,

    $P(A) = \frac{30}{100}$ and $P(B) = \frac{40}{100}$

Now let us pose the question: "If an accident occurs in the interval (20, 60),
what is the probability of the event A?" In this case, we are interested in the
probability of A on the condition that B has occurred; this is simply the proportion of
the distance that belongs to B within which A is also realized. Clearly, from Fig.
E2.14, this conditional probability is

    $P(A \mid B) = \frac{10}{40} = \frac{10/100}{40/100}$

But, in this case, 10/100 = P(AB), and 40/100 = P(B), thus illustrating Eq. 2.11.

Figure E2.14 (events A, from 0 to 30 km, and B, from 20 to 60 km, on the 100-km highway)


EXAMPLE 2.15 

Consider again the problem of three bulldozers, described earlier in Example 2.1.
Let

    F = event that the first bulldozer is operational after 6 months
    E = 2 bulldozers are operational after 6 months

If the sample points are all equally likely, then referring to the Venn diagram shown
in Fig. E2.15, the conditional probability of E given F is

    $P(E \mid F) = \frac{2}{4}$

This is simply the ratio of the number of sample points in EF relative to those in F,
illustrating therefore the notion that F is taken as the new "sample space." Similarly,
the conditional probability of F given E would be

    $P(F \mid E) = \frac{2}{3}$

However, if the sample points are not equally likely, then the associated probability
measures must be used in the calculation of the conditional probability. For example,
if the probability of a bulldozer's operating at least 6 months is 80%, then (assuming
statistical independence; see Eq. 2.15) the probabilities of the various sample
points will be as follows:

    P(GGG) = 0.512
    P(GGB) = 0.128
    P(GBB) = 0.032
    P(BBB) = 0.008
    P(BGG) = 0.128
    P(BBG) = 0.032
    P(GBG) = 0.128
    P(BGB) = 0.032

In this case, $P(E \mid F)$ must reflect the probabilities of the sample points in EF
relative to those of the sample points in F. Accordingly, we have

    $P(E \mid F) = \frac{P(GGB \cup GBG)}{P(GGG \cup GGB \cup GBG \cup GBB)} = \frac{P(EF)}{P(F)}$
    $= \frac{0.128 + 0.128}{0.512 + 0.128 + 0.128 + 0.032} = \frac{0.256}{0.80} = 0.32$

Figure E2.15
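The weighted calculation in this example can be reproduced by enumerating the eight sample points. The sketch below assumes, as in the example, that each bulldozer is operational (G) with probability 0.8 independently of the others, and reads the event E as exactly two bulldozers operational; the helper names are illustrative.

```python
from itertools import product

p_good = 0.8  # probability that a bulldozer is operational after 6 months

# Sample points such as "GGB": first and second operational, third not.
points = ["".join(t) for t in product("GB", repeat=3)]


def point_prob(point):
    """Probability of one sample point under statistical independence."""
    p = 1.0
    for state in point:
        p *= p_good if state == "G" else 1.0 - p_good
    return p


def prob(event):
    """Probability of an event given as a set of sample points."""
    return sum(point_prob(pt) for pt in event)


F = {pt for pt in points if pt[0] == "G"}          # first bulldozer operational
E = {pt for pt in points if pt.count("G") == 2}    # exactly two operational

p_E_given_F = prob(E & F) / prob(F)   # Eq. 2.11 with F as the new sample space
p_F_given_E = prob(E & F) / prob(E)
print(round(p_E_given_F, 4), round(p_F_given_E, 4))
```

Setting `p_good = 0.5` makes all sample points equally likely, in which case the two ratios reduce to the simple point counts used at the start of the example.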


It may be emphasized that the conditional probability is a
generalization of the probability of an event. When we speak of the
probability of an event E, it is implicitly conditioned on the sample space S.
[This is illustrated in Example 2.14; the probabilities P(A) and P(B) are
based on the condition that an accident occurs in the 100-km highway.]
To be more explicit, P(E) should be written

    $P(E \mid S) = \frac{P(ES)}{P(S)}$

But since ES = E, and P(S) = 1.0,

    $P(E \mid S) = P(E)$

In other words, conditioning on the sample space S is presumed to be understood;
however, when the probability is conditioned on an event other than
the original sample space, the reconstituted "sample space" must be made
explicit.

We observe also that

    $P(E_1 \mid E_2) + P(\bar{E}_1 \mid E_2) = \frac{P(E_1 E_2)}{P(E_2)} + \frac{P(\bar{E}_1 E_2)}{P(E_2)}$
    $= \frac{P[(E_1 \cup \bar{E}_1)E_2]}{P(E_2)}$
    $= \frac{P(E_2)}{P(E_2)}$
    $= 1.0$

Therefore

    $P(\bar{E}_1 \mid E_2) = 1 - P(E_1 \mid E_2)$    (2.12)

which is a generalization of Eq. 2.7. It is important to recognize that in
Eq. 2.12 the conditioning event $E_2$ is the reconstituted sample space; for
this reason we must make sure, when applying Eq. 2.12, that the event
(for example, $E_1$) and its complement refer to the same reconstituted
sample space $E_2$. For example, observe the following:

    $P(E_1 \mid \bar{E}_2) \neq 1 - P(E_1 \mid E_2)$
    $P(\bar{E}_1 \mid \bar{E}_2) \neq 1 - P(E_1 \mid E_2)$


EXAMPLE 2.16 


It has been observed that vehicles approaching a certain intersection in a given
direction are twice as likely to go straight ahead as to make a right turn; also,
left turns are only half as likely as right turns.

Assume that these conditions are valid for any vehicle. Then if a vehicle approaches
the intersection in the indicated direction, we can ask the following.

(a) What are all the possibilities (that is, the different directions for the vehicle
to take)?

    Straight ahead = $E_1$
    Turn right = $E_2$
    Turn left = $E_3$

(b) What are the respective probabilities?

    $P(E_1) = \frac{4}{7}$    $P(E_2) = \frac{2}{7}$    $P(E_3) = \frac{1}{7}$

(c) What is the probability of a right turn if a car is definitely going to make a
turn?

    $P(E_2 \mid E_2 \cup E_3) = \frac{P[E_2(E_2 \cup E_3)]}{P(E_2 \cup E_3)} = \frac{P(E_2)}{P(E_2 \cup E_3)} = \frac{2/7}{3/7} = \frac{2}{3}$

On the other hand, if a vehicle is definitely turning at the intersection, the probability
that it will not turn right is

    $P(\bar{E}_2 \mid E_2 \cup E_3) = 1 - P(E_2 \mid E_2 \cup E_3) = \frac{1}{3}$
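The probabilities in this example follow from the stated ratios alone. A minimal sketch, assuming the relative likelihoods 4 : 2 : 1 for straight, right, and left (straight is twice right, left is half of right) and normalizing so that the three mutually exclusive, collectively exhaustive events sum to one:

```python
from fractions import Fraction

# Relative likelihoods implied by the stated ratios.
weights = {"straight": 4, "right": 2, "left": 1}
total = sum(weights.values())
P = {direction: Fraction(w, total) for direction, w in weights.items()}

# Conditioning on "the vehicle turns" (right or left) as the new sample space.
p_turn = P["right"] + P["left"]                  # mutually exclusive events
p_right_given_turn = P["right"] / p_turn         # Eq. 2.11
p_not_right_given_turn = 1 - p_right_given_turn  # Eq. 2.12

print(P["straight"], P["right"], P["left"])
print(p_right_given_turn, p_not_right_given_turn)
```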


Statistical independence. If the occurrence (or nonoccurrence) of one
event does not affect the probability of another event, the two events are
statistically independent. Therefore, if $E_1$ and $E_2$ are statistically
independent,*

    $P(E_1 \mid E_2) = P(E_1)$
and    (2.13)
    $P(E_2 \mid E_1) = P(E_2)$

Multiplication rule. From Eq. 2.11, the probability of the joint event
$E_1 E_2$ is

    $P(E_1 E_2) = P(E_1 \mid E_2)P(E_2)$
or    (2.14)
    $P(E_1 E_2) = P(E_2 \mid E_1)P(E_1)$

If $E_1$ and $E_2$ are statistically independent events, then this multiplication
rule becomes

    $P(E_1 E_2) = P(E_1)P(E_2)$    (2.15)

For three events, the multiplication rule is

    $P(E_1 E_2 E_3) = P(E_1 \mid E_2 E_3)P(E_2 \mid E_3)P(E_3)$    (2.14a)

and if the events are statistically independent,

    $P(E_1 E_2 E_3) = P(E_1)P(E_2)P(E_3)$    (2.15a)

We would expect that if $E_1$ and $E_2$ are statistically independent, their
complements $\bar{E}_1$ and $\bar{E}_2$ would also be statistically independent. This can
be verified in the case of two events as follows:

    $P(\bar{E}_1 \bar{E}_2) = P(\overline{E_1 \cup E_2}) = 1 - P(E_1 \cup E_2)$
    $= 1 - [P(E_1) + P(E_2) - P(E_1)P(E_2)]$
    $= [1 - P(E_1)][1 - P(E_2)]$
    $= P(\bar{E}_1)P(\bar{E}_2)$    (2.16)

Finally, we should emphasize that all the mathematical rules pertaining
to probability apply equally to conditional probabilities defined within the
same reconstituted sample space, including specifically the following:

    $P(E_1 \cup E_2 \mid A) = P(E_1 \mid A) + P(E_2 \mid A) - P(E_1 E_2 \mid A)$    (2.17)
    $P(E_1 E_2 \mid A) = P(E_1 \mid E_2 A)P(E_2 \mid A)$    (2.18)

* This way of defining statistical independence is intuitively more direct. Although this
is somewhat unconventional, because statistical independence is usually defined
mathematically in the form of Eq. 2.15, the Mathematical Association of America (1972)
suggested the use of the conditional definition of statistical independence, that is,
Eq. 2.13.


Figure E2.19 (ultrasonic readings at the marked locations; spacing 1/10 mile)

Let

    G = actual thickness of pavement is at least 7.5 in.
    A = measured thickness > 7.5 in.

The statement "reliability of the ultrasonics test is 80%" may be interpreted to mean

    $P(G \mid A) = 0.80$

and

    $P(\bar{G} \mid \bar{A}) = 0.80$

Hence

    $P(\bar{G} \mid A) = 1 - 0.80 = 0.20$

Based on the contractor's past record, we may assume that 90% of his work will
have satisfactory ultrasonics readings; hence

    $P(A) = 0.90$

The event of interest is GA; its probability, therefore, is

    $P(GA) = P(G \mid A)P(A)$
    = (0.80)(0.90) = 0.72

(b) What is the probability that a section is poorly constructed (that is, has
thickness less than 7.5 in.) but will be accepted on the basis of the ultrasonics test?

In this case, we have

    $P(\bar{G}A) = P(\bar{G} \mid A)P(A)$
    = (0.20)(0.90) = 0.18


EXAMPLE 2.20 


The settlement problem of a steel frame may be idealized as follows. A and B
represent two footings resting on soil (Fig. E2.20). Each footing may either remain
at the original level or settle 5 cm. The probability of settlement in each footing is
0.1. However, the probability that a footing will settle, given that the other has
settled, is 0.8.

(a) The possible conditions of the two footings are as follows:

    $AB$                A settles, B settles
    $\bar{A}B$          A does not settle, B settles
    $A\bar{B}$          A settles, B does not settle
    $\bar{A}\bar{B}$    A does not settle, B does not settle

Figure E2.20

(b) The probability of settlement (that is, either A or B will settle) is

    $P(A \cup B) = P(A) + P(B) - P(AB)$
    $= P(A) + P(B) - P(A)P(B \mid A)$
    = 0.1 + 0.1 - 0.1 x 0.8 = 0.12

(c) If we are interested in the event E that differential settlement (that is, a
difference in the level of the two footings) occurs, the event will consist of $\bar{A}B$ and $A\bar{B}$.
Since these two events are mutually exclusive,

    $P(E) = P(\bar{A}B) + P(A\bar{B})$
    $= P(B)P(\bar{A} \mid B) + P(A)P(\bar{B} \mid A)$
    $= (0.1)[1 - P(A \mid B)] + (0.1)[1 - P(B \mid A)]$
    = (0.1)(0.2) + (0.1)(0.2) = 0.04
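The four joint probabilities behind this example can be tabulated directly from the multiplication rule. A minimal sketch using the numbers stated in the example; the variable names are illustrative, and the last line simply checks that the four mutually exclusive, collectively exhaustive conditions sum to one.

```python
p_A = 0.1            # footing A settles
p_B = 0.1            # footing B settles
p_B_given_A = 0.8    # B settles given that A has settled
p_A_given_B = 0.8    # A settles given that B has settled

# Joint probabilities from the multiplication rule (Eq. 2.14).
p_both = p_A * p_B_given_A                     # A and B settle
p_only_B = p_B * (1.0 - p_A_given_B)           # B settles, A does not
p_only_A = p_A * (1.0 - p_B_given_A)           # A settles, B does not
p_neither = 1.0 - (p_both + p_only_A + p_only_B)

p_any_settlement = p_A + p_B - p_both          # addition rule (Eq. 2.8)
p_differential = p_only_A + p_only_B           # mutually exclusive events

print(round(p_any_settlement, 4), round(p_differential, 4), round(p_neither, 4))
```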


EXAMPLE 2.21 


The foundation of a tall building may fail either from bearing capacity or by
excessive settlement. Let B and S represent the respective failure modes. If P(B) =
0.001 and P(S) = 0.008, and $P(B \mid S)$ = probability of failure in bearing capacity
given that it has excessive settlement = 0.1, determine (a) the probability of failure
of the foundation; (b) the probability that the building has excessive settlement but
no failure in bearing capacity.

(a) $P(F) = P(B \cup S) = P(B) + P(S) - P(B \cap S)$
    $= P(B) + P(S) - P(B \mid S)P(S)$
    = 0.001 + 0.008 - (0.1)(0.008)
    = 0.009 - 0.0008 = 0.0082

(b) $P(S \cap \bar{B}) = P(\bar{B} \mid S)P(S)$
    $= [1 - P(B \mid S)]P(S)$
    = (1 - 0.1)(0.008) = 0.9 x 0.008 = 0.0072

In this problem, the conditional probability $P(B \mid S)$ cannot be larger than 1/8;
can you explain why?


EXAMPLE 2.22


There are two streams flowing past an industrial plant. The dissolved oxygen (DO)
level in the water downstream is an indication of the degree of pollution caused by
the waste dumped from the industrial plant. Let A denote the event that stream a is
polluted, and B the event that stream b is polluted. From measurements taken on the
DO level of each stream over the last year, it was determined that in a given day


P(A) -i and P(B) =; 


and the probability that at least one stream will be polluted in any given day is
P(A VB) = 

(a) Determine the probability that stream a is also polluted given that stream b
is polluted.

(b) Determine the probability that stream b is also polluted given that stream a
is polluted.

First, we compute the probability that both streams are polluted. Since

    $P(A \cup B) = P(A) + P(B) - P(A \cap B)$

we have

    $P(A \cap B) = P(A) + P(B) - P(A \cup B)$


i 


Therefore 


and T i 
P(A AB) _ 7/20 


P| A) = P(A) EE "8. 


In other words, stream b is very likely to be polluted when stream a is polluted,
whereas chances are less than 50% that stream a will be polluted when stream b
is polluted.


2.3.3. Theorem of total probability 


Sometimes the probability of an event A cannot be determined directly;
however, its occurrence is always accompanied by the occurrence of other
events $E_i$, i = 1, 2, ..., n, such that the probability of A will depend on
which of the events $E_i$ has occurred. In such a case the probability of A
will be an expected probability (that is, the average probability weighted
by those of $E_i$). Such problems require the theorem of total probability. By
way of introduction, consider the following example.


EXAMPLE 2.23 


Suppose that there is considerable uncertainty concerning the fate of the U.S.
supersonic transport (SST) project. Whether or not the United States will have a
commercial SST by 1980 will depend on the result of the presidential election in
1976. Suppose also that if the Democrats win the election, the probability of an
SST by 1980 is only 20%, whereas if the Republicans win in 1976, this probability
will be 70%.

Clearly, without knowing the party that will win the 1976 election, we cannot say
whether the required probability will be 20% or 70%. However, if the two major
parties have equal chances of winning in 1976, this probability would be the average
of 0.20 and 0.70, or

    P(SST by 1980) = 0.2(0.5) + 0.7(0.5) = 0.45

If the Republicans are favored by 3 to 2 to win in 1976, it would be reasonable to
weigh the preceding probabilities by the respective odds of winning the election;
thus

    P(SST by 1980) = 0.2(0.4) + 0.7(0.6) = 0.50

whereas, if the Democrats are favored 3 to 2 to win the election, the corresponding
probability would be

    P(SST by 1980) = 0.2(0.6) + 0.7(0.4) = 0.40


Formally, consider n mutually exclusive and collectively exhaustive
events $E_1, E_2, \ldots, E_n$; that is, $E_1 \cup E_2 \cup \cdots \cup E_n = S$. Then if A is an
event also in the same sample space (see Fig. 2.15), we have

    $A = AS$
    $= A(E_1 \cup E_2 \cup \cdots \cup E_n)$
    $= AE_1 \cup AE_2 \cup \cdots \cup AE_n$

where $AE_1, AE_2, \ldots, AE_n$ are also mutually exclusive, as can be seen in
the Venn diagram of Fig. 2.15. Then

    $P(A) = P(AE_1) + P(AE_2) + \cdots + P(AE_n)$

and by virtue of the multiplication rule, Eq. 2.14, we obtain the total
probability theorem

    $P(A) = P(A \mid E_1)P(E_1) + P(A \mid E_2)P(E_2) + \cdots + P(A \mid E_n)P(E_n)$    (2.19)

Figure 2.15 Venn diagram with events A and $E_1, E_2, \ldots, E_n$

In applying the total probability theorem, it is important to observe that
the events $E_i$, i = 1, 2, ..., n, must be mutually exclusive and collectively
exhaustive.
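The total probability theorem translates into a short weighted sum. The following is a minimal sketch; the function name `total_probability` and its tolerance argument are illustrative choices, and the check that the probabilities of the conditioning events sum to one is only a necessary condition for those events being mutually exclusive and collectively exhaustive. The usage lines reproduce the three SST cases of Example 2.23.

```python
def total_probability(conditionals, priors, tol=1e-9):
    """Return P(A) = sum of P(A | E_i) P(E_i), as in Eq. 2.19.

    conditionals: P(A | E_i) for each conditioning event E_i
    priors:       P(E_i); these must sum to 1.0 for an exhaustive,
                  mutually exclusive set of events
    """
    if len(conditionals) != len(priors):
        raise ValueError("need one conditional probability per event E_i")
    if abs(sum(priors) - 1.0) > tol:
        raise ValueError("the P(E_i) must sum to 1.0")
    return sum(c * p for c, p in zip(conditionals, priors))


# P(SST | Democrats win), P(SST | Republicans win) from Example 2.23
p_sst_given_party = [0.20, 0.70]

p_even = total_probability(p_sst_given_party, [0.5, 0.5])
p_rep_favored = total_probability(p_sst_given_party, [0.4, 0.6])
p_dem_favored = total_probability(p_sst_given_party, [0.6, 0.4])
print(round(p_even, 2), round(p_rep_favored, 2), round(p_dem_favored, 2))
```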


EXAMPLE 2.24 


Figure E2.24 shows one direction of two interstate highways $I_1$ and $I_2$ merging
into $I_3$. Assume that $I_1$ and $I_2$ have equal capacities; the rush-hour traffic, however,
is somewhat different, so that during rush hours

    $P(I_1)$ = P(excessive traffic in $I_1$) = 10%
    $P(I_2)$ = P(excessive traffic in $I_2$) = 20%

Also, denoting $P(I_1 \mid I_2)$ as the probability of excessive traffic in $I_1$, given
excessive traffic in $I_2$, we have

    $P(I_1 \mid I_2) = 50\%$

and

    $P(I_2 \mid I_1) = 100\%$

(a) If the capacity of $I_3$ is the same as that of $I_1$ or $I_2$, what is the probability
of excessive traffic in $I_3$? Assume that when $I_1$ and $I_2$ are carrying less than their
traffic capacities, $I_3$ may be exceeded with probability 20%.

First, we observe that this probability will depend on the traffic conditions in $I_1$
and $I_2$, which may be $I_1 I_2$, $\bar{I}_1 I_2$, $I_1 \bar{I}_2$, or $\bar{I}_1 \bar{I}_2$, with respective probabilities as follows:

    $P(I_1 I_2) = P(I_1 \mid I_2)P(I_2) = 0.5 \times 0.2 = 0.10$
    $P(\bar{I}_1 I_2) = P(\bar{I}_1 \mid I_2)P(I_2)$
    $= [1 - P(I_1 \mid I_2)]P(I_2) = 0.5(0.2) = 0.10$
    $P(I_1 \bar{I}_2) = P(\bar{I}_2 \mid I_1)P(I_1)$
    $= [1 - P(I_2 \mid I_1)]P(I_1) = 0$
    $P(\bar{I}_1 \bar{I}_2) = 1 - [P(I_1 I_2) + P(\bar{I}_1 I_2) + P(I_1 \bar{I}_2)]$
    = 1 - (0.1 + 0.1 + 0) = 0.80

Clearly, the traffic in $I_3$ will be excessive when the traffic in $I_1$ or $I_2$, or both, is
excessive. Also, we have $P(I_3 \mid \bar{I}_1 \bar{I}_2) = 0.20$.

Then

    $P(I_3) = P(I_3 \mid I_1 I_2)P(I_1 I_2) + P(I_3 \mid \bar{I}_1 I_2)P(\bar{I}_1 I_2) + P(I_3 \mid I_1 \bar{I}_2)P(I_1 \bar{I}_2) + P(I_3 \mid \bar{I}_1 \bar{I}_2)P(\bar{I}_1 \bar{I}_2)$
    = 1.00(0.10) + 1.00(0.10) + 1.00(0) + 0.20(0.80)
    = 0.36

Figure E2.24

(b) If the capacity of $I_3$ is twice that of $I_1$ or $I_2$, what is the probability of
excessive traffic in $I_3$? Assume that if only $I_1$ or $I_2$ has excessive traffic, the capacity of $I_3$
may be exceeded with probability of 15%.

Therefore $P(I_3 \mid I_1 \bar{I}_2) = P(I_3 \mid \bar{I}_1 I_2) = 0.15$. Furthermore, it is obvious that $I_3$
will have excessive traffic when $I_1$ and $I_2$ both have excessive traffic. Then, in this
case,

    $P(I_3) = P(I_3 \mid I_1 I_2)P(I_1 I_2) + P(I_3 \mid \bar{I}_1 I_2)P(\bar{I}_1 I_2) + P(I_3 \mid I_1 \bar{I}_2)P(I_1 \bar{I}_2) + P(I_3 \mid \bar{I}_1 \bar{I}_2)P(\bar{I}_1 \bar{I}_2)$
    = 1.00(0.10) + 0.15(0.10) + 0.15(0) + 0(0.80)
    = 0.115
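Both parts of this example share the same four traffic states and differ only in the conditional probabilities for the merged highway. The sketch below builds the state probabilities from the given values and then applies the total probability theorem for each case; the state labels and helper name are illustrative.

```python
p_I1, p_I2 = 0.10, 0.20      # excessive traffic in I1, I2
p_I1_given_I2 = 0.50
p_I2_given_I1 = 1.00

# Probabilities of the four mutually exclusive traffic states (Eq. 2.14).
states = {
    "I1 I2": p_I1_given_I2 * p_I2,
    "not-I1 I2": (1.0 - p_I1_given_I2) * p_I2,
    "I1 not-I2": (1.0 - p_I2_given_I1) * p_I1,
}
states["not-I1 not-I2"] = 1.0 - sum(states.values())


def p_excess_I3(conditionals):
    """Total probability (Eq. 2.19) over the four traffic states."""
    return sum(conditionals[s] * states[s] for s in states)


# (a) I3 has the same capacity as I1 or I2.
case_a = {"I1 I2": 1.00, "not-I1 I2": 1.00, "I1 not-I2": 1.00,
          "not-I1 not-I2": 0.20}
# (b) I3 has twice the capacity of I1 or I2.
case_b = {"I1 I2": 1.00, "not-I1 I2": 0.15, "I1 not-I2": 0.15,
          "not-I1 not-I2": 0.00}

p_a = p_excess_I3(case_a)
p_b = p_excess_I3(case_b)
print(round(p_a, 3), round(p_b, 3))
```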


EXAMPLE 2.25 


Suppose that in any given year, the probability of damaging storms (that is,
storms with wind speed exceeding, say, 60 mph) in the county of Champaign is 0.20.
During such a storm, if not accompanied by tornadoes, the probability of structural
failures in the city of Urbana (which is in Champaign County) is 0.10.

When a storm occurs in the county, the probability that it will be accompanied by
a tornado is 0.25, and the probability that this tornado will hit the city of Urbana
is 0.05. Assume that tornadoes occur only during a storm, and when the city is hit
by a tornado it is certain to cause structural failures, whereas the probability of
structural failures in the city when a tornado occurs in the county but does not hit
the city is 0.10.

Calculate the probability of structural failures in the city of Urbana in a period
of one year.

Define the following events:

F = failure of structures in city of Urbana
S = storm in Champaign County
T = tornado in Champaign County
H = tornado hitting city of Urbana

Clearly, the events $ST$, $S\bar{T}$, $\bar{S}T$, and $\bar{S}\bar{T}$ are mutually exclusive and collectively
exhaustive; hence the probability of structural failures in the city is

$$P(F) = P(F \mid ST) P(ST) + P(F \mid S\bar{T}) P(S\bar{T}) + P(F \mid \bar{S}T) P(\bar{S}T) + P(F \mid \bar{S}\bar{T}) P(\bar{S}\bar{T})$$

where

$$\begin{aligned}
P(F \mid ST) &= P[(F \mid ST) \mid H] P(H) + P[(F \mid ST) \mid \bar{H}] P(\bar{H}) \\
&= 1.00(0.05) + 0.10(0.95) = 0.145 \\
P(F \mid S\bar{T}) &= 0.10 \\
P(F \mid \bar{S}T) &= \text{unknown; not needed in this problem} \\
P(F \mid \bar{S}\bar{T}) &= 0
\end{aligned}$$

Also

$$\begin{aligned}
P(ST) &= P(T \mid S) P(S) = 0.25(0.20) = 0.05 \\
P(S\bar{T}) &= P(\bar{T} \mid S) P(S) = 0.75(0.20) = 0.15 \\
P(\bar{S}T) &= 0 \\
P(\bar{S}\bar{T}) &= 0.80
\end{aligned}$$

Therefore

$$\begin{aligned}
P(F) &= 0.145(0.05) + 0.10(0.15) + P(F \mid \bar{S}T)(0) + 0(0.80) \\
&= 0.00725 + 0.015 \approx 0.0222
\end{aligned}$$
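The nested use of the total probability theorem in this example (first over whether the tornado hits the city, then over the storm and tornado states) can be traced step by step. This is a minimal sketch with hypothetical variable names; the numerical values are those given in the example.

```python
# Given data (from the example).
p_S = 0.20              # P(S): damaging storm in the county in a given year
p_T_given_S = 0.25      # P(T | S): storm accompanied by a tornado
p_H = 0.05              # probability that the tornado hits the city
p_F_hit = 1.00          # failure is certain if the city is hit
p_F_no_hit = 0.10       # tornado in the county that misses the city
p_F_storm_only = 0.10   # P(F | S, no tornado)

# Inner step: condition on whether the tornado hits the city.
p_F_given_ST = p_F_hit * p_H + p_F_no_hit * (1 - p_H)

# Probabilities of the storm/tornado states. Tornadoes occur only during storms,
# so the state "tornado without storm" has probability zero.
p_ST = p_T_given_S * p_S
p_S_notT = (1 - p_T_given_S) * p_S
p_notS_notT = 1 - p_S

# Outer step: total probability over the states with nonzero probability.
p_F = p_F_given_ST * p_ST + p_F_storm_only * p_S_notT + 0.0 * p_notS_notT
print(f"P(F | ST) = {p_F_given_ST:.3f}, P(F) = {p_F:.5f}")
```

The unrounded sum is 0.02225, which the text reports to three significant figures.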




56 BASIC PROBABILITY CONCEPTS 


2.3.4. Bayes’ theorem 


In the situation underlying the total probability theorem (see Section
2.3.3), if the event $A$ occurred, what is the probability that a particular $E_i$
also occurred? This may be considered as a “reverse” probability.
Applying Eq. 2.14 to the joint event $AE_i$, we have

$$P(A \mid E_i) P(E_i) = P(E_i \mid A) P(A)$$

Therefore we obtain the desired probability

$$P(E_i \mid A) = \frac{P(A \mid E_i) P(E_i)}{P(A)} \qquad (2.20)$$

which is known as Bayes' theorem. If $P(A)$ is expanded using the total
probability theorem, Eq. 2.20 becomes

$$P(E_i \mid A) = \frac{P(A \mid E_i) P(E_i)}{\sum_{j=1}^{n} P(A \mid E_j) P(E_j)} \qquad (2.20a)$$
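Bayes' theorem with the denominator expanded by the total probability theorem translates directly into a few lines of code. The sketch below is illustrative only: the function name `bayes_posterior` and the three-event numbers in the usage lines are hypothetical and do not come from the text.

```python
def bayes_posterior(priors, likelihoods):
    """Return P(E_i | A) for every E_i in a mutually exclusive, exhaustive set.

    priors[i]      = P(E_i)
    likelihoods[i] = P(A | E_i)
    """
    if len(priors) != len(likelihoods):
        raise ValueError("priors and likelihoods must have the same length")
    if abs(sum(priors) - 1.0) > 1e-9:
        raise ValueError("priors must sum to 1 (collectively exhaustive events)")
    joint = [lik * pri for lik, pri in zip(likelihoods, priors)]
    p_a = sum(joint)  # theorem of total probability: P(A)
    if p_a == 0:
        raise ValueError("P(A) is zero; the posterior is undefined")
    return [j / p_a for j in joint]


# Hypothetical three-event illustration.
posterior = bayes_posterior(priors=[0.5, 0.3, 0.2], likelihoods=[0.1, 0.2, 0.4])
print([round(p, 3) for p in posterior])
```

Because every posterior shares the same denominator, the returned values always sum to 1, which is a convenient check on hand calculations.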


EXAMPLE 2.26

Referring again to the pavement problem of Example 2.19, we might ask, “What
is the probability that if a section is well constructed, it will be accepted on the
basis of the ultrasonics test?”

This means $P(A \mid G)$, which according to Eq. 2.20, is given by

$$P(A \mid G) = \frac{P(G \mid A) P(A)}{P(G)}$$

From Example 2.19, we have

$$P(G \mid A) = 0.80 \quad \text{and} \quad P(A) = 0.90$$

To determine $P(G)$, we observe that $A$ and $\bar{A}$ are mutually exclusive and collectively
exhaustive; hence, according to Eq. 2.19,

$$\begin{aligned}
P(G) &= P(G \mid A) P(A) + P(G \mid \bar{A}) P(\bar{A}) \\
&= 0.80(0.90) + (0.20)(0.10) \\
&= 0.74
\end{aligned}$$

Therefore the required probability is

$$P(A \mid G) = \frac{0.80(0.90)}{0.74} = 0.973$$

Conversely,

$$P(\bar{A} \mid G) = 1 - 0.973 = 0.027$$

which is the probability that a well-constructed section may be rejected on the basis
of the ultrasonics test. These latter probabilities should be compared and contrasted
with $P(G \mid A)$ and $P(G \mid \bar{A})$ of Example 2.19; the difference in meaning is not trivial.
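The reversal of the conditioning in this example can be reproduced numerically. This is a minimal sketch with hypothetical variable names, using the values quoted from Example 2.19 (A = section accepted by the test, G = section well constructed).

```python
p_A = 0.90               # P(A): section accepted on the basis of the test
p_G_given_A = 0.80       # P(G | A): well constructed, given accepted
p_G_given_notA = 0.20    # P(G | not A): well constructed, given rejected

# Theorem of total probability for P(G).
p_G = p_G_given_A * p_A + p_G_given_notA * (1 - p_A)

# Bayes' theorem: probability of acceptance given a well-constructed section.
p_A_given_G = p_G_given_A * p_A / p_G
p_notA_given_G = 1 - p_A_given_G

print(f"P(G) = {p_G:.2f}")
print(f"P(A | G) = {p_A_given_G:.3f}, P(not A | G) = {p_notA_given_G:.3f}")
```

Note that P(G | A) = 0.80 and P(A | G) of about 0.973 answer different questions, even though both involve the same two events.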


EXAMPLE 2.27 


The air pollution in a city is caused mainly by industrial and automobile exhausts.
In the next 5 years, the chances of successfully controlling these two sources
of pollution are, respectively, 75% and 60%. Assume that if only one of the two
sources is successfully controlled, the probability of bringing the pollution below
acceptable level would be 80%.

(a) What is the probability of successfully controlling air pollution in the next
5 years?

(b) If, in the next 5 years, the pollution level is not sufficiently controlled, what
is the probability that it is entirely caused by the failure to control automobile
exhaust?

Assuming statistical independence between controlling industrial ($I$) and automobile
($A$) exhausts, we have

$$\begin{aligned}
P(AI) &= 0.75 \times 0.60 = 0.45 \\
P(A\bar{I}) &= 0.25 \times 0.60 = 0.15 \\
P(\bar{A}I) &= 0.75 \times 0.40 = 0.30 \\
P(\bar{A}\bar{I}) &= 0.25 \times 0.40 = 0.10
\end{aligned}$$

Then, denoting $E$ as the event of controlling air pollution,

(a) $$P(E) = 1.00(0.45) + 0.80(0.15) + 0.80(0.30) + 0(0.10) = 0.81$$

(b) $$P(\bar{A}I \mid \bar{E}) = \frac{P(\bar{E} \mid \bar{A}I) P(\bar{A}I)}{P(\bar{E})} = \frac{0.20 \times 0.30}{0.19} = 0.32$$

A related question: If pollution is not controlled, what is the probability
that control of automobile exhaust was not successful? This calls for $P(\bar{A} \mid \bar{E})$; but

$$\begin{aligned}
P(\bar{A} \mid \bar{E}) &= P(\bar{A}I \cup \bar{A}\bar{I} \mid \bar{E}) \\
&= P(\bar{A}I \mid \bar{E}) + P(\bar{A}\bar{I} \mid \bar{E}) \\
&= \frac{P(\bar{E} \mid \bar{A}I) P(\bar{A}I)}{P(\bar{E})} + \frac{P(\bar{E} \mid \bar{A}\bar{I}) P(\bar{A}\bar{I})}{P(\bar{E})} \\
&= \frac{0.20(0.30)}{0.19} + \frac{1.00(0.10)}{0.19} \\
&= 0.84
\end{aligned}$$

whereas

$$\begin{aligned}
P(\bar{I} \mid \bar{E}) &= P(\bar{I}A \cup \bar{I}\bar{A} \mid \bar{E}) = \frac{P(\bar{E} \mid \bar{I}A) P(\bar{I}A)}{P(\bar{E})} + \frac{P(\bar{E} \mid \bar{I}\bar{A}) P(\bar{I}\bar{A})}{P(\bar{E})} \\
&= \frac{0.20(0.15)}{0.19} + \frac{1.00(0.10)}{0.19} = \frac{0.13}{0.19} \approx 0.68
\end{aligned}$$
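The same results follow from enumerating the four joint control states and applying Bayes' theorem to the event that pollution is not controlled. The sketch below is illustrative; the state encoding and the helper `p_success` are assumptions, and the probabilities are those of the example.

```python
from itertools import product

p_control = {"auto": 0.60, "industrial": 0.75}


def p_success(auto_ok, industrial_ok):
    """P(E | state): 1.0 if both sources are controlled, 0.8 if one is, 0 if none."""
    return {2: 1.0, 1: 0.8, 0: 0.0}[int(auto_ok) + int(industrial_ok)]


# Joint probabilities of the four states, assuming statistical independence.
joint = {}
for auto_ok, ind_ok in product([True, False], repeat=2):
    pa = p_control["auto"] if auto_ok else 1 - p_control["auto"]
    pi = p_control["industrial"] if ind_ok else 1 - p_control["industrial"]
    joint[(auto_ok, ind_ok)] = pa * pi

p_E = sum(p_success(*state) * p for state, p in joint.items())
p_not_E = 1 - p_E

# Posterior probability of each state given that pollution is NOT controlled.
post = {s: (1 - p_success(*s)) * p / p_not_E for s, p in joint.items()}

p_only_auto_failed = post[(False, True)]
p_auto_failed = post[(False, True)] + post[(False, False)]
p_industrial_failed = post[(True, False)] + post[(False, False)]

print(f"P(E) = {p_E:.2f}")
print(f"only automobile control failed: {p_only_auto_failed:.2f}")
print(f"automobile control failed: {p_auto_failed:.2f}")
print(f"industrial control failed: {p_industrial_failed:.2f}")
```

Working from the full posterior over states means any further question of this kind reduces to summing the relevant entries of `post`.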
EXAMPLE 2.28

Aggregates for construction are ordered from two different companies. Company
A delivers 600 loads each day, out of which 3% do not satisfy the specified quality.
Company B supplies 400 loads each day, out of which only 1% are substandard.

(a) What is the probability that a load of aggregate picked at random came from
company A?

(b) What is the probability that a load of aggregate picked at random will not
pass the specified standard?

(c) If a load of aggregate was found to be substandard, what is the probability
that it came from company A?

Solutions:

(a) Since there are altogether 1000 loads, out of which 600 came from company
A, the probability that a load picked at random comes from company A is

$$P(A) = \frac{600}{1000} = 0.6$$

(b) The substandard aggregate may come from either company A or company B.
We may apply the theorem of total probability to compute the probability of the
event E, that is, picking a load of substandard aggregate:

$$\begin{aligned}
P(E) &= P(E \mid A) P(A) + P(E \mid B) P(B) \\
&= 0.03 \times \frac{600}{1000} + 0.01 \times \frac{400}{1000} \\
&= 0.018 + 0.004 = 0.022
\end{aligned}$$

(c) If the load of aggregate picked at random is substandard, the probability
that it comes from company A is no longer 0.6 as in (a), because the sample space
is changed. Instead of 1000 loads, the new sample space consists of only substandard
aggregate loads, which is

(0.03 × 600 + 0.01 × 400) = 18 + 4 = 22 loads

out of which only 18 are from company A. Hence

$$P(A \mid \text{the aggregate is substandard}) = \frac{0.03 \times 600}{0.03 \times 600 + 0.01 \times 400} = \frac{18}{22} = 0.818$$

Since the aggregate of company A is of poorer quality than that of B, the additional
information that a load of aggregate is substandard increases the probability that
such a load comes from A.
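The reduced-sample-space argument of part (c) is easy to verify by counting loads. This is a minimal sketch with hypothetical variable names, using the daily deliveries and substandard rates stated in the example.

```python
loads = {"A": 600, "B": 400}              # loads delivered per day
substandard_rate = {"A": 0.03, "B": 0.01}

total_loads = sum(loads.values())

# (a) Probability that a randomly picked load came from company A.
p_A = loads["A"] / total_loads

# (b) Theorem of total probability for a substandard load.
p_E = sum(substandard_rate[c] * loads[c] / total_loads for c in loads)

# (c) Reduced sample space: expected number of substandard loads per company.
substandard = {c: substandard_rate[c] * loads[c] for c in loads}
p_A_given_E = substandard["A"] / sum(substandard.values())

print(f"P(A) = {p_A:.1f}, P(E) = {p_E:.3f}, P(A | E) = {p_A_given_E:.3f}")
```

The ratio of counts in part (c) is the same quantity Bayes' theorem gives, P(E | A)P(A) / P(E), since the common factor of 1000 loads cancels.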


Bayes’ theorem is useful for revising or updating the calculated probability
as more data and information become available. The following
examples will serve to illustrate this, including how prior information
(which may be based on judgmental assumptions) is combined with test
results to update the calculated probability.


EXAMPLE 2.29

Consider a pile foundation, in which pile groups are used to support the individual
column footings. Each of the pile groups is designed to support a load of 200 tons.
Under normal conditions, this is quite safe. However, on rare occasions the load
may reach as high as 300 tons. The foundation engineer wished to know the probability
that a pile group can carry this extreme load of up to 300 tons.

Based on previous experience with similar pile foundations, supplemented with
blow counts and soil tests, the engineer estimated a probability of 0.70 that any
pile group can support a 300-ton load. Also, among those that have capacity less
than 300 tons, 50% failed at loads less than 280 tons.

To improve the estimated probability, the foundation engineer ordered one pile
group to be proof-loaded to 280 tons. If the pile group survives the specified proof
load, the probability of the pile group supporting a load of 300 tons can be updated
as follows.

Let

A = event that the capacity of pile group ≥ 300 tons
T = event of a successful proof load.

Then according to the information given above, $P(T \mid \bar{A}) = 0.5$, and $P(A) = 0.70$;
and clearly $P(T \mid A) = 1.0$. Bayes’ theorem then gives

$$\begin{aligned}
P(A \mid T) &= \frac{P(T \mid A) P(A)}{P(T \mid A) P(A) + P(T \mid \bar{A}) P(\bar{A})} \\
&= \frac{1.00(0.70)}{1.00(0.70) + 0.5(0.3)} = \frac{0.70}{0.85} = 0.824
\end{aligned}$$

Therefore, if the proof test is successful, the required probability is increased from
0.70 to 0.824.
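The proof-load update can be evaluated directly from the three given probabilities. The sketch below is illustrative, with hypothetical variable names; the failed-test branch is an added consequence of the same data rather than something worked in the text. Evaluating the Bayes ratio gives 0.70/0.85, or about 0.824.

```python
p_A = 0.70               # prior: capacity of the pile group is at least 300 tons
p_T_given_A = 1.00       # a 300-ton pile group certainly survives 280 tons
p_T_given_notA = 0.50    # half of the weaker pile groups still survive 280 tons

# Theorem of total probability: chance that the proof test succeeds.
p_T = p_T_given_A * p_A + p_T_given_notA * (1 - p_A)

# Updated probability after a successful proof test.
p_A_given_T = p_T_given_A * p_A / p_T

# Updated probability after a failed proof test: failure at 280 tons
# is impossible for a pile group with capacity of 300 tons or more.
p_A_given_fail = (1 - p_T_given_A) * p_A / (1 - p_T)

print(f"P(T) = {p_T:.2f}")
print(f"P(A | T) = {p_A_given_T:.3f}, P(A | test fails) = {p_A_given_fail:.3f}")
```

A successful test raises the probability only moderately because weaker pile groups also pass half the time; a more severe proof load would be more informative.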


EXAMPLE 2.30 


Aggregates for a highway pavement are extracted from a gravel pit. Based on
experience with the material from this area, it is known that the probabilities are

$$\begin{aligned}
P(G) &= P(\text{good-quality aggregate}) = 0.70 \\
P(\bar{G}) &= P(\text{poor-quality aggregate}) = 0.30
\end{aligned}$$

In order to improve this prior information, the engineer tested a sample of the
aggregate. However, the test method is not perfectly reliable: the probability that
a perfectly good-quality aggregate will pass the test is 80%, whereas the probability
of a poor-quality aggregate passing the test is 10%.

Let $T_1$ denote the event that a sample passes the test. Then, if a sample does
indeed pass the test, the updated probability is

$$\begin{aligned}
P(G \mid T_1) &= \frac{P(T_1 \mid G) P(G)}{P(T_1 \mid G) P(G) + P(T_1 \mid \bar{G}) P(\bar{G})} \\
&= \frac{(0.8)(0.7)}{(0.8)(0.7) + (0.1)(0.3)} = 0.95
\end{aligned}$$

Therefore, with a positive test result, the probability of good-quality aggregate is
increased significantly, from 70% to 95%.

Suppose that the engineer is not satisfied with just one sample test, and another
sample is tested. If this additional sample also passes the test (event $T_2$), the
probability is updated further as follows, with the updated value 0.95 now serving
as the prior $P(G)$:

$$\begin{aligned}
P(G \mid T_2) &= \frac{P(T_2 \mid G) P(G)}{P(T_2 \mid G) P(G) + P(T_2 \mid \bar{G}) P(\bar{G})} \\
&= \frac{(0.8)(0.95)}{(0.8)(0.95) + (0.1)(0.05)} = 0.993
\end{aligned}$$

This updating is performed sequentially. The updating may also be performed in a
single step using the two test results together. In this latter case, we have

$$\begin{aligned}
P(G \mid T_1 T_2) &= \frac{P(T_1 T_2 \mid G) P(G)}{P(T_1 T_2 \mid G) P(G) + P(T_1 T_2 \mid \bar{G}) P(\bar{G})} \\
&= \frac{(0.8)(0.8)(0.7)}{(0.8)(0.8)(0.7) + (0.1)(0.1)(0.3)} = 0.993
\end{aligned}$$

which is clearly the same as the result obtained sequentially above, as it should be.
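The equivalence of sequential and single-step updating can be confirmed numerically. The following is a minimal sketch; the function name `update` is hypothetical, and the code assumes, as the single-step calculation in the example does, that the two test outcomes are independent given the aggregate quality.

```python
P_PASS_GOOD = 0.8   # P(T | G): good-quality aggregate passes the test
P_PASS_POOR = 0.1   # P(T | not G): poor-quality aggregate passes the test


def update(prior_good):
    """One Bayesian update of P(G) after a sample passes the test."""
    numerator = P_PASS_GOOD * prior_good
    return numerator / (numerator + P_PASS_POOR * (1 - prior_good))


prior = 0.70

# Sequential updating: the posterior after test 1 is the prior for test 2.
after_first = update(prior)
after_second = update(after_first)

# Single-step updating with both passing results taken together.
num = P_PASS_GOOD ** 2 * prior
single_step = num / (num + P_PASS_POOR ** 2 * (1 - prior))

print(f"after one test: {after_first:.3f}")
print(f"after two tests (sequential): {after_second:.4f}")
print(f"after two tests (single step): {single_step:.4f}")
```

Carrying the unrounded posterior forward (about 0.949 rather than 0.95) makes the two routes agree to machine precision, not just to three decimals.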
2.4. CONCLUDING REMARKS


In this chapter, we learn that a probabilistic problem involves the deter- 
mination of the probability of an event within an exhaustive set of possi- 
bilities (or possibility space). Two things are paramount in the formulation 
and solution of such problems: (1) the definition of the possibility space 
and the identification of the event within this space; and (2) the evalua- 
tion of the probability of the event. The relevant mathematical bases 
useful for these purposes are the theory of sets and the theory of probability. 
In this chapter, the basic elements of both theories are developed in ele- 
mentary and nonabstract terms, and are illustrated with physical problems. 

Defined in the context of sets, events can be combined to obtain other
events via the operational rules of sets and subsets; basically, these consist
of the union and intersection of two or more events including their
complements. Similarly, the operational rules of the theory of probability
provide the bases for the deductive relationships among probabilities of
different events within a given possibility space; specifically, these consist
of the addition rule, the multiplication rule, the theorem of total probability,
and Bayes’ theorem.

In essence, the concepts developed in this chapter constitute the fundamentals
of applied probability. In Chapters 3 and 4, additional analytical
tools will be developed based on these fundamental concepts.


PROBLEMS 


Sections 2.1 and 2.2


2.1 The possible settlements for the three supports of a bridge shown in Fig. P2.1
are as follows:
support A: 0 in., 1 in., 2 in.
support B: 0 in., 2 in.
support C: 0 in., 1 in., 2 in.

PROBLEMS 61: 


Figure P2.1


Figure P2.2 (possible travel times, in hr, are marked on each link)


(a) Identify the sample space representing all possible settlements of the
three supports; for example, (1, 0, 2) means A settles 1 in., B settles
0 in., and C settles 2 in.

(b) If E is the event of 2 in. differential settlement between any adjacent 
supports of the bridge, determine the sample points of E. 


2.2 Figure P2.2 shows a network of highways connecting the cities 1, 2, . . . , 9.
(a) Identify the sample space representing all possible routes between
cities 1 and 9.
(b) The possible travel times between any two connecting nodes are as 
indicated in Fig. P2.2 (for example, from 2 to 9, the possible travel 
times are 3, 4, 5 hr). What are the possible travel times between 1 and 9 
through route (I) -- @ — (9)? How about through route (]) ^ > 
09-99 
2.3 A 6 m × 48 m apartment building may be divided into 1-, 2-, or 3-bedroom
units, or combinations thereof (Fig. P2.3). If 1-bedroom units are each
6 m × 6 m, 2-bedroom units are each 6 m × 12 m, and 3-bedroom units are
each 6 m × 16 m, how may the apartment building be subdivided into one or
more types of units?


2.4 A left-turn pocket of length 60 ft is planned at a street intersection. Assume
that only two types of vehicles will be using it; a type-A vehicle will occupy
15 ft of the pocket, whereas a type-B vehicle will occupy 30 ft.
(a) Identify all the possible combinations of types A and B vehicles waiting
for left turns from the pocket.
(b) Group these possibilities into events of 1, 2, 3, and 4 vehicles waiting
for left turns.
2.5 Strong wind at a particular site may come from any direction between due east
($\theta = 0°$) and due north ($\theta = 90°$). All values of wind speed $V$ are possible.
(a) Sketch the sample space for wind speed and direction.
(b) Let
$A = \{V > 20 \text{ mph}\}$
$B = \{12 \text{ mph} < V < 30 \text{ mph}\}$
$C = \{\theta \le 30°\}$
Identify the events $A$, $B$, $C$, and $\bar{A}$ in the sample space sketched in
part (a).
(c) Use new sketches to identify the following events:
$D = A \cap C$
$E = A \cup B$
$F = A \cap B \cap C$
(d) Are the events D and E mutually exclusive? How about events A
and C?


Figure P2.3


2.6 The possible values of the water height H, relative to mean water level, at
each of the two rivers A and B are as follows (in meters):

H = −3, −2, −1, 0, 1, 2, 3, 6

(a) Consider river A and define the following events:
$A_1 = \{H_A > 0\}$, $A_2 = \{H_A = 0\}$, $A_3 = \{H_A \le 0\}$
List all pairs of mutually exclusive events among $A_1$, $A_2$, and $A_3$.
(b) At each river, define

Normal water, $N = \{-1 \le H \le 1\}$

Drought, $D = \{H < -1\}$

Flood, $F = \{H > 1\}$

Use the ordered pair $(h_A, h_B)$ to identify sample points relating to joint
water levels in A and B, respectively; thus (3, −1) defines the condition
$h_A = 3$ and $h_B = -1$ simultaneously. Determine the sample points for the
events

GQ Na ONg (i) (Fy Y Dg) ONY 


2.7 The sequence of main activities in the construction of two structures is
shown in Fig. P2.7. The construction of the superstructures A and B can
start as soon as their common foundation has been completed.

The possible times of completion for each phase of construction are
indicated in Fig. P2.7; for example, the foundation phase may take 5 or 7
months.

(a) List the possible combinations of times for each phase of the project;
for example, (5, 3, 6) denotes the event that it takes 5 months for foundation,
3 months for superstructure A, and 6 months for superstructure B.

(b) What are the possible total completion times for structure A alone?
For structure B alone?

Figure P2.7


Figure P2.8


(c) What are the possible total completion times for the project?
(d) If the possibilities in part (a) are equally likely, what is the probability
that the complete project will be finished within 10 months?


2.8 A cylindrical tank is used to store water for a town (Fig. P2.8). The available
supply is not completely predictable. In any one day, the inflow is equally
likely to fill 6, 7, or 8 ft of the tank. The demand for water is also variable,
and may (with equal likelihood) require an amount equivalent to 5, 6, or 7 ft
of water in the tank.

(a) What are the possible combinations of inflow and outflow in a day?

(b) Assuming that the water level in the tank is 7 ft at the start of a day,
what are the possible water levels in the tank at the end of the day?
What is the probability that there will be at least 9 ft of water remaining
in the tank at the end of the day?


2.9 A power plant has two generating units, numbered 1 and 2. Because of
maintenance and occasional machine malfunctions, the probabilities that,
in a given week, units No. 1 and 2 will be out of service (these two events are
denoted by $E_1$ and $E_2$) are 0.01 and 0.02, respectively.

During a summer week there is a probability of 0.10 that the weather will
be extremely hot (say average temperature > 85°F; this event is denoted
by $H$) so that demand for power for air conditioning will increase considerably.
The performance of the power plant in terms of its potential ability
to meet the demand in a given week can be classified as

(i) satisfactory S, if both units are functioning and the average temperature
is below 85°F
(ii) poor P, if one of the units is out of service and the average
temperature is above 85°F
(iii) marginal M, otherwise.

Assume $H$, $E_1$, and $E_2$ are statistically independent.
(a) Define the events S, P, and M in terms of $H$, $E_1$, and $E_2$.
(b) What is the probability that exactly one unit will be out of service in
any given week?
(c) Find P(S), P(P), and P(M).
Section 2.3 


2.10 A cantilever beam has 2 hooks where weights (1) and (2) may be hung (Fig.
P2.10). There can be as many as two weights or no weight at each hook. In
order to design this beam, the engineer needs to know the fixed-end moment
at A, that is, $M_A$.

(a) What are all the possible values of $M_A$?
(b) Let
$E_1$ denote the event that $M_A > 600$ ft-lb
$E_2$ denote the event that $200 < M_A < 800$ ft-lb
Are events $E_1$ and $E_2$ mutually exclusive? Why?

(c) Are events Ej and E; mutually exclusive? Where E; = (0, 100, 400}. 

(d) With the following information:
Probability that weight (1) hangs at B = 0.2
Probability that weight (1) hangs at C = 0.7
Probability that weight (2) hangs at B = 0.3
Probability that weight (2) hangs at C = 0.5
what are the probabilities associated with each sample point in part (a)?
Assume that the location of weight (1) does not affect the probability
of the location of weight (2).


Figure P2.10

(e) Determine the probabilities of the following events: 


Ey, Ej, Ey N Ex, Ey V Eo, E, 


2.11 In a building construction project, the completion of the building requires
the successive completion of a series of activities. Define

E = excavation completed on time; and P(E) = 0.8
F = foundation completed on time; and P(F) =
S = superstructure completed on time; and P(S) = 0.9

Assume statistical independence among these events.

(a) Define the event (project completed on time) in terms of E, F, and S.
Compute the probability of on-time completion.

(b) Define, in terms of E, F, S and their complements, the following event:
G = excavation will be on time and at least one of the other two
operations will not be on time
Calculate P(G).

(c) Define the event
H = only one of the three operations will be on time

2.12 The waste from an industrial plant is subjected to treatment before it is
ejected to a nearby stream. The treatment process consists of three stages,
namely: primary, secondary, and tertiary treatments (Fig. P2.12). The
primary treatment may be rated as good ($G_1$), incomplete ($I_1$), or failure ($F_1$).
The secondary treatment may be rated as good ($G_2$) or failure ($F_2$), and the
tertiary treatment may also be rated as good ($G_3$) or failure ($F_3$). Assume
that the ratings in each treatment are equally likely (for example, the primary
treatment will be equally likely to be good or incomplete or failure). Furthermore,
the performances of the three stages of treatment are statistically independent
of one another.

(a) What are the possible combined ratings of the three treatment stages
(for example, $G_1$, $F_2$, $G_3$ denotes a combination where there is a good
primary and tertiary, but a failure in the secondary treatment)? What is
the probability of each of these combinations (or sample points)?

(b) Suppose the event of satisfactory overall treatment requires at least
two stages of good treatment. What is the probability of this event?

(c) Suppose:

$E_1$ = good primary treatment
$E_2$ = good secondary treatment
$E_3$ = good tertiary treatment


Determine : j 
P(E), P(E, Y E),  P(EE, 


(d) Express in terms of E,, E», Es the event of satisfactory overall treatment i 
as defined i in part (b). (Hint. E,E, is part of this event.) 


Figure P2.12


Figure P2.13


2.13 The cross-sections of the rivers at A, B, and C are shown in Fig. P2.13 and
the flood levels at A and B, above mean flow level, are as follows:

Flood level at A (ft)    Probability
0                        0.25
2                        0.25
4                        0.25
6                        0.25

Flood level at B (ft)    Probability
0                        0.20
2                        0.20
4                        0.20
6                        0.20
8                        0.20

Assume that the flow velocities at A, B, and C are the same. What is the
probability that the flood at C will be higher than 6 ft above the mean level?
Assume statistical independence between flood levels at A and B. Ans. 0.3.


2.14 Figure P2.14 is a plot of test results showing the degree of subgrade compaction
C versus the life of pavement L. Determine the following:


Figure P2.14 (horizontal axis: degree of compaction, %; total number of data points, n = 100)


Figure P2.15
(a) PQO < L «40| C > 70) 
( PUL > 40| C «95. 
(c) P(L > 40 | 70 < C < 95)
(d) P(L > 30 and C < 70) 

2.15 The highway system between cities A and B is shown in Fig. P2.15. Travel
between A and B during the winter months is not always possible because
some parts of the highway may not be open to traffic, because of extreme
weather conditions. Let $E_1$, $E_2$, $E_3$ denote the events that highways AB, AC,
and CB are open, respectively.

On any given day, assume
$P(E_1) = 2/5$        $P(E_3 \mid E_2) = 4/5$
$P(E_2) = 3/4$        $P(E_1 \mid E_2 E_3) = 1/2$
$P(E_3) = 2/3$
(a) What is the probability that a traveler will be able to make a trip from
A to B if he has to pass through city C? Ans. 0.6.


(b) What is the probability that he will be able to get to city B? Ans. 0.7.
(c) Which route should he try first in order to maximize his chance of
getting to B?


2.16 A contractor is submitting bids to two jobs A and B. The probability that he
will win job A is P(A) 1 and that for job B is P(B) = $ .
(a) Assuming that winning job A and winning job B are independent events,
what is the probability that the contractor will get at least one job?
(b) What is the probability that the contractor got job A if he has won at
least one job?
(c) If he is also submitting a bid for job C with probability of winning it
P(C) = 1/4, what is the probability that he will get at least one job?
Again assume statistical independence among A, B, and C. What is
the probability that the contractor will not get any job at all?


2.17 Cities 1 and 2 are connected by route A, and route B connects cities 2 and 3


(Fig. P2.17). Denote the eastbound lanes as $A_1$ and $B_1$ and the westbound
lanes as $A_2$ and $B_2$, respectively.

Suppose that the probability is 90% that a lane in route A will not require
major repair work for at least 2 years; the corresponding probability for a
lane in route B is only 80%.

(a) Determine the probability that route A will require major repair work
in the next two years. Do the same for route B.

Assume that if one lane of a route needs repair, the chance that the
other lane will also need repair is 3 times its original probability.
Ans. 0.17; 0.28.

(b) Assuming that the need for repair works in routes A and B are independent
of each other, what is the probability that the road between
cities 1 and 3 will require major repair in two years? Ans. 0.40.


Figure P2.17


2.18 The water supply system for a city consists of a storage tank and a pipe line
supplying water from a reservoir some distance away (Fig. P2.18). The
amount of water available from the reservoir is variable depending on the
precipitation in the watershed (among other things). Consequently, the
amount of water stored in the tank would be also variable. The consumption
of water also fluctuates considerably.


To simplify the problem, denote
A = available water supply from the reservoir is low
B = water stored in the tank is low
C = level of consumption is low

and assume that

P(A) = 20%
P(B) = 15%
P(C) = 50%

The reservoir supply is regulated to a certain extent to meet the demand, so
that

$P(\bar{A} \mid \bar{C})$ = P(reservoir supply is high | consumption is high) = 75%

Also, $P(B \mid A) = 50\%$, whereas the amount of water stored is independent of
the demand.

Suppose that a water shortage will occur when there is high demand (or
consumption) for water but either the reservoir supply is low or the stored
water is low. What is then the probability of a water shortage? Assume that
$P(AB \mid \bar{C}) = 0.5\,P(AB)$.


Figure P2.18


2.19 The time T (in minutes) that it takes to load crushed rocks from a quarry
onto a truck varies considerably. From a record of 48 loadings, the following
were observed.

Loading time T (minutes)    No. of observations
0 ≤ T < 1                   0
1 ≤ T < 2                   5
2 ≤ T < 3                   12
3 ≤ T < 4                   15
4 ≤ T < 5                   10
5 ≤ T < 6                   6
T ≥ 6                       0
Total                       48

(a) Sketch the histogram for the above data.

(b) Based on these data, what is the probability that the loading time T
for a truck will be at least 4 minutes?

(c) What is the probability that the total time for loading 2 consecutive
trucks will be less than 6 minutes? Assume the loading times for any
two trucks to be statistically independent.

(d) In order to make a conservative estimate of the loading time, it is
assumed that loading a truck will require at least 3 minutes; on this
assumption, what will be the probability that the loading time for a
truck will be less than 4 minutes?


2.20 A gravity retaining wall may fail either by sliding (A) or overturning (B) or
both (Fig. P2.20). Assume:
(i) Probability of failure by sliding is twice as likely as that by overturning;
that is, P(A) = 2P(B).
(ii) Probability that the wall also fails by sliding, given that it has failed by
overturning, P(A | B) = 0.8
(iii) Probability of failure of wall = $10^{-3}$
(a) Determine the probability that sliding will occur. Ans. 0.00091.
(b) If the wall fails, what is the probability that only sliding has occurred?
Ans. 0.546.


Figure P2.20


2.21 Two cables are used to lift a load W (Fig. P2.21). However, normally only
cable A will be carrying the load; cable B is slightly longer than A, so normally
it does not participate in carrying the load. But if cable A breaks,
then B will have to carry the full load, until A is replaced.

The probability that A will break is 0.02; also, the probability that B
will fail if it has to carry the load by itself is 0.30.

(a) What is the probability that both cables will fail?

(b) If the load remains lifted, what is the probability that none of the cables
have failed?


Figure P2.21


2.22 The preliminary design of a bridge spanning a river consists of four girders
and three piers as shown in Fig. P2.22. From consideration of the loading and
resisting capacities of each structural element the failure probability for each
girder is $10^{-5}$ and each pier is $10^{-5}$. Assume that failures of the girders and
piers are statistically independent. Determine:

(a) The probability of failure in the girder(s).
(b) The probability of failure in the pier(s).
(c) The probability of failure of the bridge system.


Figure P2.22


2.23 The town shown in Fig. P2.23 is protected from floods by a reservoir dam
that is designed for a 50-year flood; that is, the probability that the reservoir
will overflow in a year is 1/50 or 0.02. The town and reservoir are located in an
active seismic region; annually, the probability of occurrence of a destructive
earthquake is 5%. During such an earthquake, it is 20% probable that the
dam will be damaged, thus causing the reservoir water to flood the town.
Assume that the occurrences of natural floods and earthquakes are statistically
independent.
(a) What is the probability of an earthquake-induced flood in a year?
(b) What is the probability that the town is free from flooding in any one
year?
(c) If the occurrence of an earthquake is assumed in a given year, what is
the probability that the town will be flooded that year?


Figure P2.23


2.24 From a survey of 1000 water-pipe systems in the United States, 15 of them are
reported to be contaminated by bacteria alone whereas 5 are reported to have
an excessive level of lead concentration and among these 5, there are 2 that
are found to contain bacteria also.

(a) What is the probability that a pipe system selected at random will
contain bacteria? Ans. 0.017.

(b) What is the probability that a pipe system selected at random is
contaminated? Ans. 0.02.

(c) Suppose that a pipe system is found to contain bacteria. What is the
probability that its lead concentration is also excessive? Ans. 2/17.

(d) Assume that the present probability of contamination as computed in
part (b) is not satisfactory, and it is proposed that it should not exceed
0.01. Suppose that it is difficult to control the lead contamination, but
it is possible to reduce the likelihood of bacteria contamination. What
should be the permissible probability of bacteria contamination?
Assume that the value of the conditional probability computed in
part (c) still applies. Ans. 0.00567.


2.25 The structural component shown in Fig. P2.25 has welds to be inspected for
flaws. From experience, the likelihood of detecting flaws in a foot of weld
provided by the manufacturer is 0.1; and the probability of detecting flaws
in a weld of length L ft is given by

$P(F) = 0.1L$ for $0 \le L \le 2$ ft

In general, the quality between sections of welds in a structural component is
correlated. Assume the following:
(i) If flaws are detected in section $A_1$, the probability of flaws being detected
in $A_2$ will be three times its original probability.
(ii) If flaws are detected in section A, the probability of detecting flaws in
section B will be doubled.
Let $F_{A_1}$, $F_{A_2}$, $F_A$, and $F_B$ be the events of flaws detected in weld sections
$A_1$, $A_2$, $A$, and $B$, respectively.
(a) What is the probability of detecting flaws in A? Ans. 0.28.
(b) What is the probability of detecting flaws in the structural component?
Ans. 0.324.
(c) If flaws are detected in the structural component, what is the probability
that they are found only in A? Ans. 0.692.


Figure P2.25


Figure P2.26


2.26 The storm drainage in a residential subdivision can be divided into watershed
areas $N_1$ and $N_2$ as shown in Fig. P2.26. The drainage system consists of the
main sewers with capacities $C_1$ = 100 cfm (cubic feet per minute) and $C_2$ =
300 cfm, respectively. The amounts of drainage from $N_1$ and $N_2$ are variable,
depending on the rainfall intensities in the subdivision (assume that whenever
it rains the entire subdivision is covered); in any given year, the maximum
flows, $I_1$ and $I_2$, and their corresponding probabilities are as follows.

$I_1$ (cfm)    Probability        $I_2$ (cfm)    Probability
80             0.60               100            0.50
120            0.40               210            0.30
                                  250            0.20


Neglect the possibility of flooding in N, caused by the overflow of pipe 


xd 

(a) What is the probability of flooding in area $N_1$? Flooding occurs only
when the drainage exceeds the capacity of the main sewer.

(b) What is the probability of flooding in area $N_2$?

(c) What is the probability of flooding in the subdivision?


2.27 In order to study the parking problem of a college campus, an average worker 
in office building D, say Mr. X, is selected and his chance of getting a parking 
space each day is studied. (Assume that Mr. X will check the parking lots 
A, B, C in that sequence and will park his car as soon as a space is found.) 
Assume that there are only three parking lots available, of which A and B 
are free, whereas C is metered (Fig. P2.27). No other parking facilities (say 
street parking) are allowed. From statistical data, the probabilities of getting 
a parking space each week day morning in lots A, B, C are 0.2, 0.1, 0.5, 
respectively. However, if lot A is full, the probability that Mr. X will find a
space in B is only 0.04. Also, if lots A and B are full, Mr. X will only have a 
probability of 40% of getting a parking space in C. Determine the following: 

(a) The probability that Mr. X will not be able to secure a free space on a 
weekday morning. Ans. 0.768. 

(b) The probability that Mr. X will be able to park his car on a weekday 
morning. Ans. 0.539.

(c) If Mr. X has successfully parked his car one morning, what is the proba- 
bility that it will be free of charge? Ans. 0.43. 


Figure P2.27 (Office Bldg. D)


2.28 Pollution is becoming a problem in cities I and II. City I is affected by both
air and water pollution, whereas city II is subjected to air pollution only.
A three-year plan has been put into action to control these sources of pollution
in both cities. It is estimated that air pollution in city I is 4 times as likely
to be successfully controlled as that in city II. However, if air pollution
in city II is controlled, then air pollution in city I will be controlled with 90%
probability.
The control of water pollution in city I may be assumed to be independent
of the control of air pollution in both cities. In city I, the probability that
pollution will be completely controlled (that is, both sources are controlled)
is 0.32, whereas it is also estimated that water pollution is only half as likely
to be controlled as the air pollution in that city. Let

$A_I$ be the event "air pollution in city I is controlled"
$A_{II}$ be the event "air pollution in city II is controlled"
$W_I$ be the event "water pollution in city I is controlled"

Determine:
(a) Probability that air pollution will be controlled in both cities. Ans. 0.18.
(b) Probability that pollution in both cities will be completely controlled.
Ans. 0.072.
(c) Probability that at least one city will be free of pollution. Ans. 0.448.
2.29 A form of transportation is to be provided between two cities that are 200
miles apart. The alternatives are highway (H), railway (R), or air transport


Figure P2.29


(A); the last one meaning the construction of airports in both cities. (See
Fig. P2.29.) Because of the relative merits and costs, the odds that a Committee
of Planners will decide on R, H, or A are 1 to 2 to 3. Only one of these
three means can be constructed.
However, if the committee decides on building a railroad (R), the probability
that it will be completed in one year is 50%; if it decides on a highway
(H), the corresponding probability is 75%; and if it decides on air travel,
there is a probability of 90% that the airports will be completed in one year.
(a) What is the probability that the two cities will have a means of transportation
in one year?
(b) If some transportation facility between the two cities is completed in
one year, what is the probability that it will be air transport (A)?
(c) If the committee decides in favor of land facilities, what is the probability
that the final decision will be for a highway (H)?


2.30 "Liquefaction of sand" denotes a phenomenon in foundation engineering,
in which a mass of saturated sand suddenly loses its bearing capacity because of
rapid changes in loading conditions, for example, resulting from earthquake
vibrations. When this happens, disastrous effects on structures built on the
site may follow.

For simplicity, rate earthquake intensities into low (L), medium (M),
and high (H). The likelihoods of liquefaction associated with earthquakes
of these intensities are, respectively, 0.05, 0.20, and 0.90.

Assume that the relative frequencies of occurrence of earthquakes of these
intensities are, respectively, 1, 0.1, and 0.01 per year.

(a) What is the probability that the next earthquake is of low intensity?
Ans. 0.9.
(b) What is the probability of liquefaction of sand at the site during the
next earthquake? Ans. 0.07.
(c) What is the probability that the sand will survive the next three earthquakes
(that is, no liquefaction)? Assume the conditions between
earthquakes are statistically independent. Ans. 0.80.


2.31 There are three modes of transporting material from New York to Florida,
namely, by land, sea, or air. Also land transportation may be by rail or
highway. About half of the materials are transported by land, 30% by sea,
and the rest by air.

Also, 40% of all land transportation is by highway and the rest by rail
shipments. The percentages of damaged cargo are, respectively, 10% by
highway, 5% by rail, 6% by sea, and 2% by air.
(a) What percentage of all cargoes may be expected to be damaged?
(b) If a damaged cargo is received, what is the probability that it was
shipped by land? By sea? By air?

2.32 The amount of stored water in a reservoir (Fig. P2.32a) may be idealized into 
three states: full (F), half-full (H), and empty (E). Because of the probabilistic 
nature of the inflowing water into the reservoir, as well as the outflow from 
the reservoir to meet uncertain demand for water, the amount of water 
stored may shift from one state to another during each season. Suppose that 
these transitional probabilities from one state to another are as indicated in 
Fig. P2.32b. For example, in the beginning of a season, if the water storage
is empty, the probability that it will become half-full at the end of the season
is 0.5 and the probability that it will remain empty is 0.4, and so on. Assume
that the water level is full at the start of the season.


Figure P2.32a        Figure P2.32b


(a) What is the probability that the reservoir will be full at the end of one
season? What is the probability that the reservoir will contain water
at the end of one season? Ans. 0.2; 0.9.

(b) What is the probability that the reservoir will be full at the end of the
second season? Ans. 0.33.

(c) What is the probability that the reservoir will contain water at the end of
the second season? Ans. 0.73.


2.33 At a quarry, the time required to load crushed rocks onto a truck is equally
likely to be either 2 or 3 minutes (Fig. P2.33). Also the number of trucks 
in a queue waiting to be loaded at any time varies considerably, as reflected 
in the following set of 30 observations taken at random. The time required to 


No. of trucks     No. of            Relative
in queue          observations      frequency
    0                 6               0.2
    1                 3
    2                 9
    3                 9
    4                 3
    5                 0


load a truck is statistically independent of the queue size.
(a) If there are two trucks in the queue when a truck arrives at the quarry,
what is the probability that its "waiting time" will be less than 5 minutes?
Ans. 0.25.


Figure P2.33 (quarry; queue size)


(b) Before arriving at the quarry (and thus not knowing the size of the 
queue), what is the probability that the waiting time of a particular 
truck will be less than 5 minutes? Ans. 0.375. 

2.34 A chemical plant produces a variety of products using four different processes;
the available labor is sufficient only to run one process at a time. The
plant manager knows that the discharge of dangerous pollution into the
plant waste water system and thence into a nearby stream is dependent on
which process equipment is in operation. The probability that a particular
process will be producing dangerous pollution products is as shown below:


process A    40%
process B     5%
process C    30%
process D    10%


All other processes in the plant are considered harmless.
In a typical month the relative likelihoods of processes A, B, C, and D
operating through the month are 2:4:3:1, respectively.
(a) What is the probability that there will be no dangerous pollution discharged
in a given month?
(b) If dangerous pollution is detected in the plant discharge, what is the
probability that process A was operating?
(c) The pollution products that are discharged by the various processes
have different probabilities of producing a fish kill in the stream that the
plant uses for disposal, as follows.


Process      Probability of fish kill
(table entries not legible in the source)


Based on these assumptions what is the probability that fish will be
killed by pollution in the stream in a given month?

(d) Of the four processes, which is the most fruitful one (in terms of minimizing
the likelihood of fish kill) to select for clean-up if only one can
be improved?


2.35 The probability of occurrence of fire in a subdivision has been estimated to be
30% for one occurrence and 10% for two occurrences in a year. Assume that
the chance for three or more occurrences is negligible. In a fire, the probability
that it will cause structural damage is 0.2. Assume that structural damages
between fires are statistically independent.
(a) What is the probability that there will be no structural damage caused
by fire in a year? Ans. 0.904.
(b) If a small town consists of two such subdivisions, what is the probability
that there will be some structural damage caused by fire in the town in
a year? Assume that the events of fire-induced structural damage in the
two subdivisions are statistically independent. Ans. 0.183.


2.36 At a construction project, the amount of material (say lumber for falsework)
available for any day is variable, and can be described with the frequency
diagram of Fig. P2.36. The amount of material used in a day's construction
is either 150 units or 250 units, with corresponding probabilities 0.70 and
0.30.

(a) What is the probability of shortage of material in any day? Shortage
occurs whenever the available material is less than the amount needed
for that day's construction.

(b) If there is a shortage of material, what is the probability that there
were fewer than 200 units available?


Figure P2.36 Frequency diagram of A (horizontal axis: amount of material
available, A, marked at 0, 150, 200, 250, 300; vertical axis: % per unit)
2.37 The completion time of a construction project depends on whether the
carpenters and plumbers working on the project will go on strike. The
probabilities of delay (D) are 100%, 80%, 40%, and 5% if both go on strike,
carpenters alone go on strike, plumbers alone go on strike, and neither of them
strikes, respectively. Also, there is 60% chance that plumbers will strike if
carpenters strike, and if plumbers go on strike there is 30% chance that
carpenters would follow. It is known that the chance for the plumbers'
strike is 10%. Let
C = event that carpenters went on strike
P = event that plumbers went on strike
D = delay in project completion
(a) Determine probability of delay in completion. Ans. 0.118.
(b) If there is a delay in completion, determine the following:
(i) Probability that both carpenters and plumbers strike. Ans. 0.254.
(ii) Probability that carpenters strike and plumbers do not. Ans. 0.136.
(iii) Probability of carpenters' strike. Ans. 0.390.


2.38 The water supply for a city comes from two reservoirs, a and b (Fig. P2.38).
Because of variable rainfall conditions each year, the amount of water in
each reservoir may exceed or not exceed the normal capacity. Let A denote
the event that the water in reservoir a exceeds its normal capacity, and let B
denote that for reservoir b. The following probabilities are given: P(B) = 0.8,
P(AB) = 0.6, $P(\bar{A} \mid \bar{B}) = 0.7$. In addition, the probabilities that the city
will have satisfactory supply of water if only one reservoir exceeds, both
reservoirs exceed, and none of the reservoirs exceeds the normal capacities are
0.7, 0.9, and 0.3, respectively. What is the probability that the city will have
satisfactory water supply? Ans. 0.764.

Figure P2.38 (Reservoir A; Reservoir B)

2.39 A water tower is located in an active earthquake region. When an earthquake
occurs, the probability that the tower will fail depends on the magnitude of
the earthquake and also on the amount of storage in the tank at the time of
shaking of the ground. For simplicity, assume that the tank is either full
(F) or half-full (H) with relative likelihoods of 1 to 3. The earthquake magnitude
may be assumed to be either strong (S) or weak (W) with relative
frequencies 1 to 9.

When a strong earthquake occurs, the tower will definitely collapse
regardless of the storage level. However, the tower will certainly survive a
weak earthquake if the tank is only half-full. If the tank is full during a weak
earthquake, it will have a 50-50 chance of survival.

If the tower collapsed during a recent earthquake, what is the probability
that the tank was full at the time of the earthquake?


2.40 For a county in Texas, the probabilities that it will be hit by one or two hurricanes
each year are 0.3 and 0.05, respectively. The event that it will be hit
by three or more hurricanes in a year may be assumed to have negligible
probability.
This county may be subjected to floods each year from the melting of snow
in the upstream regions, or from the heavy precipitation brought by hurricanes,
or both. Normally, the chance of flood in a year, caused by the melting
snow only, is 10%. However, during a hurricane there is a 25% probability
of flooding. Assume that floods caused by the melting snow and floods
caused by hurricanes are independent events.

What is the probability that there will be flooding in this county in a year?


2.41 Before the design of a tunnel through a rocky region, geological exploration
was conducted to investigate the joints and the potential slip surfaces that
exist in the rock strata (Fig. P2.41). For economic reasons, only portions of
the strata are explored. In addition the measurements recorded by the instruments
are not perfectly reliable. Thus the geologist can only conclude that the
condition of the rock may be either highly fissured (H), medium fissured (M),
or slightly fissured (L) with relative likelihoods of 1:1:8. Based on this


Figure P2.41


information, the engineer designs the tunnel and estimates that if the rock
condition is L, the reliability of the proposed design is 99.9%. However,
if it turns out that the rock condition is M, the probability of failure will be
doubled; similarly, if the rock condition is H, the probability of failure will
be 10 times that for condition L.

(a) What is the expected reliability of the proposed tunnel design? Ans.
0.998.

(b) A more reliable device is subsequently used to improve the prediction
of rock condition. Its results indicate that a highly fissured condition for
the rock around the tunnel is practically impossible, but it cannot give
better information on the relative likelihood between rock conditions
M and L. In light of this new information, what would be the revised
reliability of the proposed tunnel design? Ans. 0.9989.

(c) If the tunnel collapsed, what should be the updated probabilities of M
and L? Ans. 0.797; 0.203.

2.42 Three research and development groups, A, B, and C, submitted proposals
for a research project to be awarded by a research agency of the government.
From past performance records, the respective histograms of completion
time relative to the scheduled target time ($t_0$) are shown in Fig. P2.42. It is
known that groups A and B have about equal chances of getting the project,
whereas C is twice as likely as either A or B to win the contract.

Based on past performance records, determine:

(a) The probability that the project will be completed on schedule. Ans.
0.60.


Figure P2.42 Histogram of project completion for A and B; histogram of project
completion for C (vertical axes: fraction of projects, %; horizontal axis: $t/t_0$)


(b) If the project completion is delayed, what is the probability that it was
originally awarded to C? Ans. 0.25.


2.43 Two independent remote sensing devices, A and B, mounted on an airplane
are used to determine the locations of diseased trees in a large area of forest
land. The detectability of device A is 0.8 (that is, the probability that a group
of diseased trees will be detected by device A is 0.8), whereas the detectability
of device B is 0.9.

However, when a group of diseased trees has been detected its location
may not be pinpointed accurately by either device. Based on a detection
from device A alone, the location can be accurately determined with probability
0.7, whereas the corresponding probability with device B alone is
only 0.4. If the same group of diseased trees is detected by both devices, its
location can be pinpointed with certainty. Determine the following.

(a) The probability that a group of diseased trees will be detected. Ans.
0.98.

(b) The probability that a group of diseased trees will be detected by only
one device. Ans. 0.26.

(c) The probability of accurately locating a group of diseased trees. Ans.
0.848.


3. Analytical Models of 
Random Phenomena 


3.1. RANDOM VARIABLES 


In engineering and the physical sciences many random phenomena of interest
are associated with the numerical outcomes of some physical quantity.
In the various examples discussed earlier, we were concerned with the number
of bulldozers operative after six months, the time required to complete a
project, and the flood of a river above mean flow level, all of which are
outcomes in numerical terms. However, we also saw examples in which the
outcomes are not in numerical terms, for example, the state of completion
of a project in one year, the survival or failure of a chain, and the availability
of different modes of transportation. Events of this latter type may
also be identified numerically by artificially assigning numerical values to
each of the possible alternative events; for example, the three states of
completion of a project in one year (definitely completed, completion questionable,
and definitely incomplete) may be arbitrarily assigned the numbers
1, 2, and 3, respectively.
In other words, the possible outcomes of a random phenomenon can be 
identified numerically, either naturally or artificially. In any case, an out- 
come or event may be identified through the value(s) of a function; such a 
function is a random variable, which is usually denoted with a capital letter. 
The value (or range of values) of a random variable then represents a 
distinct event; for example, if the values of X represent floods above mean 
level, then X > 7 ft stands for the occurrence of a flood higher than 7 ft, 
and (referring to the example above) if Y is the state of completion of a 
project in one year, then Y = 2 means that the project’s completion is 
questionable in one year. In short, a random variable is a device (cooked 
up when necessary) to identify events in numerical terms. Henceforth, we 
can then say that $(X = a)$, or $(X \le b)$, or $(a < X \le b)$ is an event.
More formally, a random variable may be considered as a rule that maps
events in a sample space into the real line. The mapping is one-to-one; also,
mutually exclusive events are mapped into nonoverlapping intervals on the
real line. In Fig. 3.1 the events $E_1$, $E_2$, and so on, from the sample space S


Figure 3.1 Mapping of events into real line through random variable X
(labels: sample space S; random variable X; real line x)


are mapped into the real line through the random variable X; these events
can then be identified as follows:

$$E_1 = (a < X \le b)$$

$$E_2 = (c < X \le d)$$

$$\overline{E_1 \cup E_2} = (X \le a) \cup (X > d); \qquad E_1 E_2 = (c < X \le b)$$


Consistent with the underlying sample space, a random variable may be
discrete or continuous.

The purpose and advantages of identifying events in numerical terms
should be obvious: this will then permit convenient analytical description
as well as graphical display of events and their probabilities.


3.1.1. Probability distribution of a random variable 


Since the value of a random variable represents an event, it can assume a
numerical value only with an associated probability or probability measure.
The rule for describing the probability measures associated with all
the values of a random variable is a probability distribution or "probability
law."
If X is a random variable, its probability distribution can always be
described by its cumulative distribution function (CDF), which is

$$F_X(x) \equiv P(X \le x) \quad \text{for all } x\,\text{*} \tag{3.1}$$


Here X is a discrete random variable if only certain discrete values of x have
positive probabilities. Alternatively, X is a continuous random variable if
probability measures are defined for any value of x. A random variable may
also be both discrete and continuous; an example of such a mixed random
variable is shown in Fig. 3.2c.

* A standard notation is to denote a random variable with a capital letter, and its value
with the corresponding lowercase letter.

For a discrete random variable X, its probability distribution may also
be described in terms of a probability mass function (PMF), which is simply
a function expressing P(X = x) for all x. Therefore, if X is a discrete random
variable with PMF $p_X(x_i) \equiv P(X = x_i)$, its distribution function is

$$F_X(x) = P(X \le x) = \sum_{\text{all } x_i \le x} P(X = x_i) = \sum_{\text{all } x_i \le x} p_X(x_i) \tag{3.2}$$
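
As a minimal sketch of Eq. 3.2, the CDF of a discrete random variable may be obtained by accumulating its PMF over all values not exceeding x. The function name `cdf_from_pmf` and the fair-die PMF used below are illustrative assumptions, not taken from the text.

```python
from fractions import Fraction


def cdf_from_pmf(pmf, x):
    """Return F_X(x): the sum of p_X(x_i) over all x_i <= x (Eq. 3.2)."""
    return sum((p for xi, p in pmf.items() if xi <= x), Fraction(0))


# Hypothetical PMF: a fair six-sided die (not an example from the text).
die_pmf = {face: Fraction(1, 6) for face in range(1, 7)}

for x in (0, 1, 3.5, 6, 10):
    print(f"F_X({x}) = {cdf_from_pmf(die_pmf, x)}")
```

Exact fractions are used here only so that the step-wise character of a discrete CDF is visible without round-off; floating-point probabilities would work in the same way.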


However, if X is continuous, probabilities are associated with intervals
on the real line (since events are defined as intervals on the real line);
consequently, at a specific value of X, such as X = x, only the density of
probability is defined. Thus, for a continuous random variable, the probability
law may also be described in terms of a probability density function
(PDF), so that if $f_X(x)$ is the PDF of X, the probability of X in the interval
$(a, b]$ is

$$P(a < X \le b) = \int_a^b f_X(x)\,dx \tag{3.3}$$

It follows then that the corresponding distribution function is

$$F_X(x) = P(X \le x) = \int_{-\infty}^{x} f_X(\tau)\,d\tau \tag{3.4}$$

Accordingly, if $F_X(x)$ has a first derivative, then, from Eq. 3.4,

$$f_X(x) = \frac{dF_X(x)}{dx} \tag{3.5}$$

We might reiterate that $f_X(x)$ is not a probability; however, $f_X(x)\,dx =
P(x < X \le x + dx)$ is the probability that values of X will be in the interval
$(x, x + dx]$.
It should be emphasized that any function used to represent the probability
distribution of a random variable must necessarily satisfy the axioms of
probability (see Section 2.3.1). For this reason, the function must be nonnegative
and the probabilities associated with all possible values of the
random variable must add up to 1.0. In other words, if $F_X(x)$ is the distribution
function of X, then it must have the following properties:

(a) $F_X(-\infty) = 0$; $F_X(+\infty) = 1.0$.

(b) $F_X(x) \ge 0$, and is nondecreasing with x.

(c) It is continuous with x (at least from the right, since a discrete or mixed
distribution has jumps).

Conversely, any function possessing these properties is a bona fide cumulative
distribution function. By virtue of these properties and Eqs. 3.2 through
3.5, the PMF and PDF are nonnegative functions of x, whereas the probabilities
of a PMF add up to 1.0, and the total area under a PDF is also equal
to 1.0. Figure 3.2 presents graphic examples of legitimate probability
distributions. Figure 3.2 also illustrates the graphical characteristics of
the probability distributions of discrete, continuous, and mixed random
variables.

Figure 3.2 Bona fide probability distributions: (a) discrete X; (b) continuous X;
(c) mixed distribution
We observe that we can write Eq. 3.3 as

$$P(a < X \le b) = \int_{-\infty}^{b} f_X(x)\,dx - \int_{-\infty}^{a} f_X(x)\,dx$$

Similarly, for discrete X, we have

$$P(a < X \le b) = \sum_{\text{all } x_i \le b} p_X(x_i) - \sum_{\text{all } x_i \le a} p_X(x_i)$$

Thus, by virtue of Eqs. 3.2 and 3.4,

$$P(a < X \le b) = F_X(b) - F_X(a) \tag{3.6}$$
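
Equation 3.6 may be checked numerically by comparing a direct integration of the PDF (Eq. 3.3) with a difference of CDF values. The sketch below assumes, purely for illustration, the density $f_X(x) = e^{-x}$ for $x \ge 0$ (whose CDF is $1 - e^{-x}$); this density, the interval limits, and the helper `integrate` are assumptions and do not come from the text.

```python
import math


def pdf(x):
    """Hypothetical PDF: f_X(x) = exp(-x) for x >= 0, zero elsewhere."""
    return math.exp(-x) if x >= 0 else 0.0


def cdf(x):
    """Closed-form CDF of the PDF above: F_X(x) = 1 - exp(-x) for x >= 0."""
    return 1.0 - math.exp(-x) if x >= 0 else 0.0


def integrate(f, a, b, n=100_000):
    """Midpoint-rule approximation of the integral of f from a to b."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


a, b = 0.5, 2.0
by_integration = integrate(pdf, a, b)  # Eq. 3.3
by_cdf = cdf(b) - cdf(a)               # Eq. 3.6
print(f"integral of PDF: {by_integration:.6f}")
print(f"F_X(b) - F_X(a): {by_cdf:.6f}")
```

The two results should agree to within the accuracy of the numerical integration.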


EXAMPLE 3.1

For an example of a discrete random variable, consider again the problem of
bulldozers in Example 2.1.
Using X as the random variable, whose values represent the number of good
bulldozers after 6 months, the events in the sample space S are mapped (naturally)
into the discrete values of the real line as shown in Fig. E3.1a.

Thus (X = 0), (X = 1), (X = 2), and (X = 3) can be used to identify the respective
events of interest.

If the probability that a bulldozer will remain operational after 6 months is
p = 0.8, then assuming the conditions between bulldozers to be statistically
independent, the PMF of X becomes

$$P(X = 0) = (0.2)^3 = 0.008$$

$$P(X = 1) = 3[0.8(0.2)^2] = 0.096$$

$$P(X = 2) = 3[(0.8)^2 0.2] = 0.384$$

$$P(X = 3) = (0.8)^3 = 0.512$$

whereas P(X = x) = 0 for all other x. These results can be portrayed graphically
as shown in Fig. E3.1b. The corresponding cumulative distribution function (CDF)
would appear as in Fig. E3.1c.

Analytically, the PMF described above is given by the binomial distribution (see
Section 3.2.3) with n = 3 and p = 0.8.
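
The PMF and CDF of Example 3.1 can be reproduced with a short sketch, assuming the standard binomial form $\binom{n}{x} p^x (1-p)^{n-x}$ that the example refers to. The helper name `binomial_pmf` is hypothetical.

```python
from math import comb


def binomial_pmf(x, n, p):
    """P(X = x) for a binomial random variable with n trials and success probability p."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


n, p = 3, 0.8  # three bulldozers, each operational with probability 0.8
pmf = {x: binomial_pmf(x, n, p) for x in range(n + 1)}

# Accumulate the PMF into the CDF, as in Eq. 3.2.
cdf = {}
running_total = 0.0
for x in range(n + 1):
    running_total += pmf[x]
    cdf[x] = running_total

for x in range(n + 1):
    print(f"x = {x}: p_X = {pmf[x]:.3f}, F_X = {cdf[x]:.3f}")
```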


EXAMPLE 3.2 


To illustrate a continuous random variable, consider the problem described in 
Example 2.14. If the volume of traffic and road conditions along the 100-km highway 


Figure E3.1a (sample space S; random variable X)

Figure E3.1b PMF of X

Figure E3.1c CDF of X


are about the same, the likelihood of accidents is roughly uniform over the 100-km
distance. If X is a random variable whose values denote the distance (from km 0)
at which accidents occur, then the probability density function (PDF) of X is
constant between 0 and 100 km; that is

$$f_X(x) = \begin{cases} c & 0 \le x \le 100 \\ 0 & \text{elsewhere} \end{cases}$$

where c = 1/100. Graphically, this is shown in Fig. E3.2a. The corresponding
distribution function is

$$F_X(x) = \begin{cases} \displaystyle\int_0^x c\,dx = cx = \frac{x}{100} & 0 \le x \le 100 \\ 1.0 & x > 100 \\ 0 & x < 0 \end{cases}$$

and graphically is as shown in Fig. E3.2b. Then, for example, the probability

$$P(20 < X \le 35) = \int_{20}^{35} \frac{1}{100}\,dx = 0.15$$

or, alternatively, using Eq. 3.6,

$$P(20 < X \le 35) = F_X(35) - F_X(20) = 0.35 - 0.20 = 0.15$$

Figure E3.2a PDF of X

Figure E3.2b CDF of X

Figure E3.3 PDF of X


EXAMPLE 3.3

Suppose that a random variable X has a PDF of the form

$$f_X(x) = \begin{cases} \alpha x^2 & 0 \le x \le 10 \\ 0 & \text{elsewhere} \end{cases}$$

(See Fig. E3.3.) Under what condition (i.e., what value of $\alpha$) is this function a bona
fide PDF?

In order to satisfy all the properties of a PDF, we must have

$$\int_0^{10} \alpha x^2\,dx = 1.0$$

from which

$$\frac{\alpha}{3}(10)^3 = 1.0$$

and

$$\alpha = \frac{3}{1000}$$
The probability

$$P(X > 5) = 1 - P(X \le 5) = 1 - \int_0^5 \frac{3x^2}{1000}\,dx = 1 - \frac{125}{1000} = 0.875$$

3.1.2. Main descriptors of a random variable


The probabilistic characteristics of a random variable would be described
completely if the form of the distribution function (or equivalently its
probability density or mass function) and the associated parameters are
specified. In practice, however, the form of the distribution function may
not be known; consequently, approximate description of a random variable
is often necessary. The probabilistic characteristics of a random variable
may be described approximately in terms of certain key quantities or main
descriptors of the random variable; the most important of these are the
central value of the random variable, and a measure of the dispersion of its
values. A skewness measure may also be important and useful when the
underlying distribution is known to be nonsymmetric.

Moreover, even when the distribution function is known, the principal
quantities remain useful, because they convey information on the properties
of the random variable that are of first importance in practical applications.
Also, the parameters of the distribution may be derived as functions
of these quantities, or may be the parameters themselves (see Chapter 5).


Mean or expected value (a central value). Since there is a range of possible
values of a random variable, we would naturally be interested in some
central value, such as the average. In particular, because the different values
of the random variable are associated with different probabilities or probability
densities, the "weighted average" would be of special interest; this is
known as the mean value or the expected value of the random variable.

Therefore, if X is a discrete random variable with PMF $p_X(x_i)$, its
"weighted" average value, denoted E(X), is

$$E(X) = \sum_{\text{all } x_i} x_i\,p_X(x_i) \tag{3.7a}$$

Similarly, for a continuous random variable X with PDF $f_X(x)$, the mean
value is

$$E(X) = \int_{-\infty}^{\infty} x f_X(x)\,dx \tag{3.7b}$$


Mathematical expectation. The notion of a weighted average or expected
value can be generalized for a function of X. Given a function g(X),
its expected value E[g(X)], obtained as a generalization of Eq. 3.7, is

$$E[g(X)] = \sum_{\text{all } x_i} g(x_i)\,p_X(x_i) \tag{3.8a}$$

if X is discrete; whereas, if X is continuous,

$$E[g(X)] = \int_{-\infty}^{\infty} g(x) f_X(x)\,dx \tag{3.8b}$$

In either case, E[g(X)] is known as the mathematical expectation of g(X).
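
For a discrete random variable, Eqs. 3.7a and 3.8a amount to a probability-weighted sum, which may be sketched as follows. The helper name `expectation` is an assumption; the PMF is that of the number of operating bulldozers in Example 3.1.

```python
def expectation(pmf, g=lambda x: x):
    """E[g(X)] = sum of g(x_i) * p_X(x_i) over all x_i (Eq. 3.8a)."""
    return sum(g(x) * p for x, p in pmf.items())


# PMF of the number of operating bulldozers from Example 3.1.
bulldozer_pmf = {0: 0.008, 1: 0.096, 2: 0.384, 3: 0.512}

mean = expectation(bulldozer_pmf)                         # g(x) = x gives E(X), Eq. 3.7a
mean_square = expectation(bulldozer_pmf, lambda x: x**2)  # g(x) = x^2 gives E(X^2)
print(f"E(X) = {mean:.2f}, E(X^2) = {mean_square:.2f}")
```

Taking g as the identity recovers the mean value, so Eq. 3.7a appears here as a special case of Eq. 3.8a.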
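Other quantities that are also used to designate the central value of a
random variable include the mode (or modal value) and the median.
The mode is the most probable value of a random variable; that is, it is
the value of the random variable with the largest probability or the highest
probability density.
The median, $x_m$, is the value of the random variable at which the CDF equals
0.50, so that larger and smaller values are equally probable; that is,

$$F_X(x_m) = 0.50 \tag{3.9}$$

In general, the mean, median, and mode of a random variable may differ,
especially if the density function is not symmetric.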


Variance and standard deviation (measures of dispersion). Besides the
central value, the next most important quantity of a random variable is its
measure of dispersion or variability; that is, the quantity that gives a
measure of how closely the values of the variate are clustered (or conversely,
how widely they are spread) around the central value. Intuitively,
such a measure must be a function of the deviations from the central value.
However, whether a deviation is above or below the central value should be
of no significance; consequently, the function should be an even function of
the deviations.

If the deviations are taken with respect to the mean value, then a suitable
average measure of dispersion is the variance. For a discrete random
variable X with PMF $p_X(x_i)$, the variance of X is

$$\operatorname{Var}(X) = \sum_{\text{all } x_i} (x_i - \mu_X)^2\,p_X(x_i) \tag{3.10}$$

in which $\mu_X = E(X)$. We observe that this is simply the weighted average
of squared deviations, or, in accordance with Eq. 3.8, it is the mathematical
expectation of $g(X) = (X - \mu_X)^2$. Therefore, according to Eq. 3.8b, if X
is continuous with PDF $f_X(x)$, the variance is

$$\operatorname{Var}(X) = \int_{-\infty}^{\infty} (x - \mu_X)^2 f_X(x)\,dx \tag{3.11}$$


Expanding the integrand in Eq. 3.11, we have

$$\operatorname{Var}(X) = \int_{-\infty}^{\infty} (x^2 - 2\mu_X x + \mu_X^2) f_X(x)\,dx = E(X^2) - 2\mu_X E(X) + \mu_X^2$$

Thus a useful relation for the variance is

$$\operatorname{Var}(X) = E(X^2) - \mu_X^2 \tag{3.12}$$


In Eq. 3.12, the term $E(X^2)$ is known as the mean-square value of X.
Dimensionally, a more convenient measure of dispersion is the square
root of the variance, or the standard deviation $\sigma_X$; that is,

$$\sigma_X = \sqrt{\operatorname{Var}(X)} \tag{3.13}$$


It is hard to say, solely on the basis of the variance or standard deviation,
whether the dispersion is large or small; for this purpose, the measure of
dispersion relative to the central value is more useful. In other words,
whether the dispersion is large or small is meaningful only relative to the
central value. For this reason, the coefficient of variation (COV),

$$\delta_X = \frac{\sigma_X}{\mu_X} \tag{3.14}$$

is often a preferred and convenient nondimensional measure of dispersion
or variability.
EXAMPLE 3.1 (continued)

Referring back to Example 3.1, the expected number of operating bulldozers at
the end of 6 months is

$$E(X) = 0(0.008) + 1(0.096) + 2(0.384) + 3(0.512) = 2.40$$

This illustrates the fact that the expected value of a discrete random variable may
not be a possible value of the random variable.

The corresponding variance is

$$\mathrm{Var}(X) = 0.008(0 - 2.4)^2 + 0.096(1 - 2.4)^2 + 0.384(2 - 2.4)^2 + 0.512(3 - 2.4)^2 = 0.48$$

Using Eq. 3.12, we may compute the variance also as

$$\mathrm{Var}(X) = [1^2(0.096) + 2^2(0.384) + 3^2(0.512)] - (2.40)^2 = 0.48$$

The standard deviation, therefore, is

$$\sigma_X = \sqrt{0.48} = 0.69$$

and the coefficient of variation (COV) is

$$\delta_X = \frac{0.69}{2.40} = 0.29$$
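The arithmetic of this example can be checked with a short script. The following is a minimal sketch, not part of the original text; the function name `discrete_moments` is a hypothetical helper, and the PMF values are those of Example 3.1.

```python
import math

def discrete_moments(values, probs):
    """Return (mean, variance, standard deviation, COV) of a discrete PMF."""
    mean = sum(x * p for x, p in zip(values, probs))
    mean_square = sum(x**2 * p for x, p in zip(values, probs))
    variance = mean_square - mean**2      # Eq. 3.12
    std = math.sqrt(variance)             # Eq. 3.13
    return mean, variance, std, std / mean  # last entry is Eq. 3.14

# PMF of Example 3.1: number of operating bulldozers after 6 months
mean, var, std, cov = discrete_moments([0, 1, 2, 3],
                                       [0.008, 0.096, 0.384, 0.512])
print(mean, var, std, cov)
```

The variance is computed from the mean-square value (Eq. 3.12) rather than from the squared deviations; both routes give the same result for this PMF.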
EXAMPLE 3.3 (continued)

For the random variable X with the density function of Example 3.3, the mean
and variance are, respectively,

$$\mu_X = \int_0^{10} x \left(\frac{3x^2}{1000}\right) dx = 7.50$$

$$\mathrm{Var}(X) = \int_0^{10} (x - 7.5)^2 \left(\frac{3x^2}{1000}\right) dx
= \frac{3}{1000} \int_0^{10} [x^4 - 15x^3 + (7.5)^2 x^2] \, dx = 3.75$$

or, by Eq. 3.12,

$$\mathrm{Var}(X) = \int_0^{10} x^2 \left(\frac{3x^2}{1000}\right) dx - (7.50)^2 = 3.75$$

Therefore the standard deviation is

$$\sigma_X = \sqrt{3.75} = 1.94$$

and the corresponding COV is

$$\delta_X = \frac{1.94}{7.5} = 0.26$$

From Fig. E3.3, the modal value is obviously X = 10. To determine the median,
Eq. 3.9 yields

$$\int_0^{x_m} \frac{3x^2}{1000} \, dx = 0.50$$

from which we have

$$x_m^3 = 500$$

Thus the median is

$$x_m = 7.94$$
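The integrals above can also be evaluated numerically. The sketch below is an illustration only: it assumes the density $f_X(x) = 3x^2/1000$ on $0 \le x \le 10$ and uses a hand-written composite Simpson rule (`simpson` is a hypothetical helper, not something from the text).

```python
def simpson(f, a, b, n=1000):
    """Composite Simpson rule on [a, b] using n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def pdf(x):
    # Density assumed from Example 3.3: 3x^2/1000 on [0, 10]
    return 3 * x**2 / 1000

mean = simpson(lambda x: x * pdf(x), 0, 10)
variance = simpson(lambda x: (x - mean) ** 2 * pdf(x), 0, 10)
median = 500 ** (1 / 3)   # closed form, from x_m^3 = 500
print(mean, variance, median)
```

For densities without a closed-form CDF, the median would have to be found by root-finding on the numerically integrated CDF; here the cube root is available directly.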


EXAMPLE 3.4

A contractor has an experience record that shows 60% of his projects are
completed on schedule. If this performance record prevails, the probability of the
number of completions in the next 6 jobs can be described by the binomial
distribution (see Section 3.2.3) as follows: if X is the number of jobs completed
on schedule among 6 future jobs, then

$$P(X = x) = \begin{cases} \binom{6}{x} (0.6)^x (0.4)^{6-x}, & x = 0, 1, 2, \ldots, 6 \\ 0, & \text{otherwise} \end{cases}$$

where

$$\binom{6}{x} = \frac{6!}{x!\,(6 - x)!}$$

The mean number of jobs completed on schedule, therefore, is

$$E(X) = \sum_{x=0}^{6} x \binom{6}{x} (0.6)^x (0.4)^{6-x}$$

$$= 1\binom{6}{1}(0.6)(0.4)^5 + 2\binom{6}{2}(0.6)^2(0.4)^4 + 3\binom{6}{3}(0.6)^3(0.4)^3
+ 4\binom{6}{4}(0.6)^4(0.4)^2 + 5\binom{6}{5}(0.6)^5(0.4) + 6\binom{6}{6}(0.6)^6$$

$$= 0.03686 + 2(0.13824) + 3(0.27648) + 4(0.31104) + 5(0.18662) + 6(0.04666) = 3.60$$

Therefore the average number of jobs among 6 that can be completed on schedule
is between 3 and 4.

The corresponding variance is

$$\mathrm{Var}(X) = \sum_{x=0}^{6} (x - 3.60)^2 \binom{6}{x} (0.6)^x (0.4)^{6-x}$$

$$= (-3.60)^2 \binom{6}{0}(0.4)^6 + (-2.60)^2 \binom{6}{1}(0.6)(0.4)^5
+ (-1.60)^2 \binom{6}{2}(0.6)^2(0.4)^4 + (-0.60)^2 \binom{6}{3}(0.6)^3(0.4)^3$$
$$\quad + (0.40)^2 \binom{6}{4}(0.6)^4(0.4)^2 + (1.40)^2 \binom{6}{5}(0.6)^5(0.4)
+ (2.40)^2 \binom{6}{6}(0.6)^6$$

$$= 0.0531 + 0.2492 + 0.3539 + 0.0995 + 0.0498 + 0.3658 + 0.2687 = 1.44$$

The standard deviation, therefore, is

$$\sigma_X = \sqrt{1.44} = 1.20$$

and the coefficient of variation (COV) is

$$\delta_X = \frac{\sigma_X}{\mu_X} = \frac{1.20}{3.60} = 0.333$$

In this case, X = 4 has the highest probability; hence the mode is 4.
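A direct summation over the PMF gives a quick check on the mean, variance, and mode of this example. This is a minimal sketch using only `math.comb` from the Python standard library; `binomial_pmf` is a hypothetical helper name. For a binomial variate the sums should agree with the closed forms $np$ and $np(1 - p)$.

```python
from math import comb, sqrt

def binomial_pmf(x, n, p):
    """P(X = x) for x occurrences among n Bernoulli trials."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

n, p = 6, 0.6   # six jobs, 60% completed on schedule
pmf = [binomial_pmf(x, n, p) for x in range(n + 1)]

mean = sum(x * px for x, px in enumerate(pmf))
variance = sum((x - mean) ** 2 * px for x, px in enumerate(pmf))
mode = max(range(n + 1), key=lambda x: pmf[x])

print(mean, variance, sqrt(variance), mode)
```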
EXAMPLE 3.5

Suppose that the useful life T (in hours) of welding machines is not predictable, but
can be described with a PDF known as the exponential distribution (see Section
3.2.7):

$$f_T(t) = \begin{cases} \lambda e^{-\lambda t}, & t \ge 0 \\ 0, & t < 0 \end{cases}$$

in which $\lambda$ is a constant parameter. Graphically, this probability density function is
shown in Fig. E3.5a. In this case, the corresponding distribution function is

$$F_T(t) = \int_0^t \lambda e^{-\lambda \tau} \, d\tau = 1 - e^{-\lambda t}$$

which is also shown graphically in Fig. E3.5b.

Figure E3.5 PDF and CDF of T

The mean life of a welding machine then is

$$\mu_T = E(T) = \int_0^{\infty} t \, \lambda e^{-\lambda t} \, dt$$

Integration by parts will yield $\mu_T = 1/\lambda$. Therefore, for the exponential distribution,
the parameter $\lambda$ is equal to the reciprocal of the mean life; that is, $\lambda = 1/E(T)$.

The mode in this case is zero, whereas the corresponding median life $t_m$ is
obtained as follows. According to the definition of the median, Eq. 3.9,

$$\int_0^{t_m} \lambda e^{-\lambda t} \, dt = 0.50$$

thus obtaining

$$t_m = \frac{-\ln 0.5}{\lambda} = \frac{0.693}{\lambda}$$

Therefore

$$t_m = 0.693 \, \mu_T$$

The variance of T is

$$\mathrm{Var}(T) = \int_0^{\infty} \left(t - \frac{1}{\lambda}\right)^2 \lambda e^{-\lambda t} \, dt$$

Integration by parts yields

$$\mathrm{Var}(T) = \frac{1}{\lambda^2}$$

Thus the corresponding standard deviation is

$$\sigma_T = \frac{1}{\lambda} = \mu_T$$

Therefore the COV of the exponential distribution is

$$\delta_T = 1.00$$
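The closed-form results of this example are easy to tabulate for a given parameter. The sketch below is illustrative only; the rate $\lambda = 0.002$ per hour (a mean life of 500 hr) is a made-up value, not one given in the text, and the function names are hypothetical.

```python
import math

def exponential_cdf(t, lam):
    """F_T(t) = 1 - exp(-lam * t) for t >= 0, and 0 otherwise."""
    return 1 - math.exp(-lam * t) if t >= 0 else 0.0

def exponential_summary(lam):
    """Mean, median, standard deviation and COV of the exponential law."""
    mean = 1 / lam
    median = math.log(2) / lam   # equals -ln(0.5)/lam
    std = 1 / lam
    return {"mean": mean, "median": median, "std": std, "cov": std / mean}

lam = 0.002   # hypothetical rate, per hour
summary = exponential_summary(lam)
print(summary, exponential_cdf(summary["median"], lam))
```

Evaluating the CDF at the median returns 0.5, which confirms the median relation independently of the value chosen for $\lambda$.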


Measure of skewness. Another useful property of a random variable is
the symmetry or lack of symmetry of its probability distribution, and its
associated degree and direction of asymmetry. A measure of this asymmetry
or skewness is the third central moment, or

$$E(X - \mu_X)^3 = \sum_{\text{all } x_i} (x_i - \mu_X)^3 \, p_X(x_i) \quad \text{for discrete } X$$

and

$$E(X - \mu_X)^3 = \int_{-\infty}^{\infty} (x - \mu_X)^3 f_X(x) \, dx \quad \text{for continuous } X$$

Observe that $E(X - \mu_X)^3$ is zero if the probability distribution is
symmetric about $\mu_X$; otherwise it may be positive or negative. It will be positive
if the values of X that are greater than $\mu_X$ are more widely dispersed than
the values of X that are less than $\mu_X$. On the other hand, this third moment about
the mean will be negative if the reverse situation is true. Therefore, the
skewness of a random variable may be designated as positive or negative
in accordance with the sign of the third moment $E(X - \mu_X)^3$; the
magnitude of this third moment gives the corresponding degree of skewness.
These properties are illustrated in Fig. 3.3.

A convenient nondimensional measure of skewness is the skewness
coefficient,

$$\theta = \frac{E(X - \mu_X)^3}{\sigma_X^3} \tag{3.15}$$
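As an illustration of the sign convention, the third central moment can be divided by the cube of the standard deviation for two small PMFs. This is a sketch under stated assumptions: `skewness_coefficient` is a hypothetical helper, the first PMF is an invented symmetric case, and the second is the PMF of Example 3.1.

```python
def skewness_coefficient(values, probs):
    """Third central moment divided by the cube of the standard deviation."""
    mean = sum(x * p for x, p in zip(values, probs))
    var = sum((x - mean) ** 2 * p for x, p in zip(values, probs))
    third = sum((x - mean) ** 3 * p for x, p in zip(values, probs))
    return third / var**1.5

# Symmetric PMF (invented for illustration): the coefficient vanishes
theta_symmetric = skewness_coefficient([0, 1, 2], [0.25, 0.50, 0.25])

# PMF of Example 3.1: values below the mean are more widely dispersed
theta_example = skewness_coefficient([0, 1, 2, 3],
                                     [0.008, 0.096, 0.384, 0.512])
print(theta_symmetric, theta_example)
```

The second result is negative, consistent with a distribution whose longer tail lies below the mean.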


Analogies with properties of area. The mean value and the variance
correspond, respectively, to the centroidal distance and the central moment
of inertia of an area. To see this, consider a unit area having a general
shape shown in Fig. 3.4.

The centroidal distance $x_0$ of the area is

$$x_0 = \frac{\int_{-\infty}^{\infty} x f(x) \, dx}{\text{area}} = \int_{-\infty}^{\infty} x f(x) \, dx \tag{3.16}$$

which is also the first moment (about 0) of the irregular-shaped area.


Figure 3.3 Asymmetric PDF. Positive skewness: $E(X - \mu_X)^3 > 0$; negative skewness: $E(X - \mu_X)^3 < 0$.


The moment of inertia about the vertical centroidal axis is

$$I_y = \int_{-\infty}^{\infty} (x - x_0)^2 f(x) \, dx \tag{3.17}$$

Comparing Eqs. 3.7b and 3.11 with Eqs. 3.16 and 3.17, respectively, we
see that the mean value is equivalent to the centroidal distance, whereas
the variance is equivalent to the centroidal moment of inertia of an area.

Figure 3.4 An irregular area

In this regard, we can therefore refer to the mean value as the first
moment, and the variance as the second (central) moment of a random
variable. More generally, extending this terminology, we shall refer to

$$E(X^n) = \int_{-\infty}^{\infty} x^n f_X(x) \, dx \tag{3.18}$$

as the nth moment of X.


MOMENT-GENERATING AND CHARACTERISTIC FUNCTIONS* 


The approximate description of a random variable (discussed in Section 3.1.2) 
can be improved with a knowledge of its higher-order moments. Indeed, if all the 
moments of a random variable are known, its probability distribution would also 
be completely described. This means that a function through which all the moments 
can be generated is an alternative way of describing the probability law of a random 
variable; such a function is a moment-generating function. Its complex form is a 
characteristic function. 

The moment-generating function of a random variable X, denoted $G_X(s)$, is
defined as the expected value of $e^{sX}$; that is,

$$G_X(s) = E(e^{sX}) \tag{3.19}$$

where s is an auxiliary (deterministic) variable.

Therefore, if X has a PDF $f_X(x)$, the corresponding moment-generating function
is

$$G_X(s) = \int_{-\infty}^{\infty} e^{sx} f_X(x) \, dx \tag{3.20a}$$

whereas if X is discrete with PMF $p_X(x_i)$,

$$G_X(s) = \sum_{\text{all } x_i} e^{s x_i} p_X(x_i) \tag{3.20b}$$

From Eq. 3.20a we observe that

$$\frac{dG_X(s)}{ds} = \int_{-\infty}^{\infty} x e^{sx} f_X(x) \, dx$$

Therefore

$$\frac{dG_X(0)}{ds} = E(X), \text{ the expected value of } X.$$

Similarly,

$$\frac{d^2 G_X(0)}{ds^2} = \int_{-\infty}^{\infty} x^2 f_X(x) \, dx = E(X^2)$$

and, in general,

$$\frac{d^n G_X(0)}{ds^n} = \int_{-\infty}^{\infty} x^n f_X(x) \, dx = E(X^n) \tag{3.21}$$

Therefore the nth moment of a random variable is given by the nth derivative of its
moment-generating function evaluated at s = 0.
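The derivative property can be illustrated numerically. The sketch below is not from the original text: it evaluates the moment-generating function of a discrete variate and approximates its first two derivatives at s = 0 by central finite differences (the step size `h` and the helper names are assumptions). The PMF is that of Example 3.1.

```python
import math

def mgf(s, values, probs):
    """G_X(s) = sum of exp(s * x_i) * p_X(x_i) for a discrete variate."""
    return sum(math.exp(s * x) * p for x, p in zip(values, probs))

def moments_from_mgf(values, probs, h=1e-3):
    """Approximate E(X) and E(X^2) from derivatives of G_X at s = 0."""
    def g(s):
        return mgf(s, values, probs)
    first = (g(h) - g(-h)) / (2 * h)               # ~ dG/ds at 0
    second = (g(h) - 2 * g(0.0) + g(-h)) / h**2    # ~ d2G/ds2 at 0
    return first, second

m1, m2 = moments_from_mgf([0, 1, 2, 3], [0.008, 0.096, 0.384, 0.512])
print(m1, m2, m2 - m1**2)   # mean, mean-square value, variance
```

Finite differences are only approximate; a symbolic derivative would reproduce the moments exactly, but the numerical version keeps the example dependency-free.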


* This section is presented here only for mathematical definition; the material is not
necessary for understanding the remainder of the book.


It can also be shown that the variance is given by

$$\mathrm{Var}(X) = \frac{d^2}{ds^2} \ln G_X(0) \tag{3.22}$$

The characteristic function of X is defined as

$$\phi_X(s) = E(e^{isX}) = G_X(is) \tag{3.23}$$

in which $i = \sqrt{-1}$. Therefore

$$\phi_X(s) = \int_{-\infty}^{\infty} e^{isx} f_X(x) \, dx \tag{3.24a}$$

or

$$\phi_X(s) = \sum_{\text{all } x_j} e^{i s x_j} p_X(x_j) \tag{3.24b}$$

In terms of the characteristic function, the nth moment of X is given by

$$E(X^n) = \frac{1}{i^n} \frac{d^n \phi_X(0)}{ds^n} \tag{3.25}$$

whereas the special relation for the variance is

$$\mathrm{Var}(X) = -\frac{d^2}{ds^2} \ln \phi_X(0) \tag{3.26}$$


3.2. USEFUL PROBABILITY DISTRIBUTIONS

Any function possessing all the properties cited earlier (in Section 3.1.1)
can be used to describe the probability distribution of a random variable.
However, there are a number of discrete and continuous functions that are
especially useful for one or more of the following reasons: (1) the
function is the result of an underlying physical process and is derived on the
basis of certain physically reasonable assumptions; (2) the function is the
result of some limiting process; and (3) it is widely known and the
necessary statistical information (including probability tables) is widely
available. Several of these probability distribution functions are presented
and their special properties described in this section.

Perhaps the best-known and most widely used probability distribution is
the normal distribution, also known as the Gaussian distribution. The normal
distribution has a probability density function given by

$$f_X(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{1}{2}\left(\frac{x - \mu}{\sigma}\right)^2\right], \quad -\infty < x < \infty \tag{3.27}$$

where $\mu$ and $\sigma$ are the parameters of the distribution, which are also the
mean and standard deviation, respectively, of the variate. A short notation
for this distribution is $N(\mu, \sigma)$, which we shall adopt.


Figure 3.5 Density functions of the standard normal distribution


The standard normal distribution. A Gaussian distribution with
parameters $\mu = 0$ and $\sigma = 1.0$ is known as the standard normal
distribution and is denoted appropriately as N(0, 1). The density function,
accordingly, is

$$f_S(s) = \frac{1}{\sqrt{2\pi}} e^{-(1/2)s^2}, \quad -\infty < s < \infty \tag{3.27a}$$

Several density functions of N(0, 1) are shown graphically in Fig. 3.5;
of some interest are the total probabilities within a specified number of
standard deviations from the mean (which is zero), as shown in Fig. 3.5.
Observe that the density function of N(0, 1) is symmetric about zero.
Because of its wide usage, a special notation $\Phi(s)$ is commonly used to
designate the distribution function of the standard normal variate S; that is,
$\Phi(s) = F_S(s)$, where S has N(0, 1) distribution. Referring to Fig. 3.6,
we have

$$\Phi(s_p) = p$$

Figure 3.6 The standard normal density function (shaded area to the left of $s_p$: probability = p)

Conversely, the value of a standard normal variate at a cumulative
probability p would be denoted as

$$s_p = \Phi^{-1}(p)$$


This notation will be used throughout this book.

The distribution function of N(0, 1), that is, $\Phi(s)$, is tabulated widely
as tables of normal probabilities; see, for example, Table A.1 of Appendix A.
Observe from Table A.1 that the probabilities are given only for positive
values of the variate. This is because by virtue of symmetry of the standard
normal PDF about zero, the probabilities for negative values of the variate
can be obtained as

$$\Phi(-s) = 1 - \Phi(s) \tag{3.27b}$$

By the same token, values of s corresponding to p < 0.5 may be obtained as

$$s = \Phi^{-1}(p) = -\Phi^{-1}(1 - p) \tag{3.27c}$$


With the table of $\Phi(s)$, probabilities of any other normal distribution
can then be determined readily as follows. Suppose a normal variate X with
distribution $N(\mu, \sigma)$; the probability

$$P(a < X \le b) = \frac{1}{\sigma\sqrt{2\pi}} \int_a^b \exp\left[-\frac{1}{2}\left(\frac{x - \mu}{\sigma}\right)^2\right] dx$$

Clearly this is the area under the normal curve between a and b, as shown
in Fig. 3.7. Theoretically, the required probability can be obtained by
evaluating the preceding integral directly; however, this can be done also
by making the following change of variable:

$$s = \frac{x - \mu}{\sigma} \quad \text{and} \quad dx = \sigma \, ds$$

Figure 3.7 PDF for $N(\mu, \sigma)$

Then

$$P(a < X \le b) = \frac{1}{\sigma\sqrt{2\pi}} \int_{(a-\mu)/\sigma}^{(b-\mu)/\sigma} e^{-(1/2)s^2} \sigma \, ds
= \frac{1}{\sqrt{2\pi}} \int_{(a-\mu)/\sigma}^{(b-\mu)/\sigma} e^{-(1/2)s^2} \, ds$$

which may be recognized to be the area of the standard normal density
function between $(a - \mu)/\sigma$ and $(b - \mu)/\sigma$, and thus according to Eq. 3.6
can be determined also as

$$P(a < X \le b) = \Phi\left(\frac{b - \mu}{\sigma}\right) - \Phi\left(\frac{a - \mu}{\sigma}\right) \tag{3.28}$$


EXAMPLE 3.6

Suppose, from historical record, that the total annual rainfall in a catch basin
is estimated to be normal N(60 in., 15 in.).

(a) What is the probability that in future years the annual rainfall will be between
40 and 70 in.?

According to Eqs. 3.28 and 3.27b, this probability is

$$P(40 < X \le 70) = \Phi\left(\frac{70 - 60}{15}\right) - \Phi\left(\frac{40 - 60}{15}\right)
= \Phi(0.67) - \Phi(-1.33) = \Phi(0.67) - [1 - \Phi(1.33)]$$

From Table A.1 we therefore obtain the probability

$$P(40 < X \le 70) = 0.748571 - (1 - 0.908241) = 0.6568$$

(b) What is the probability that the annual rainfall will be at least 30 in.?

$$P(X \ge 30) = \Phi(\infty) - \Phi\left(\frac{30 - 60}{15}\right)
= 1 - \Phi(-2.00) = 1 - [1 - \Phi(2.00)] = \Phi(2.00) = 0.9772$$

(c) What is the 10-percentile annual rainfall in the basin (that is, the value of the
variate at which the cumulative probability is 10%)? In other words, the probability
that the annual rainfall will be less than the 10-percentile value is 10%.

In this case, we wish to determine $x_{.10}$ so that

$$P(X \le x_{.10}) = \Phi\left(\frac{x_{.10} - 60}{15}\right) = 0.10$$

Observing from Table A.1 that probabilities less than 0.50 are associated with
negative values of the variate, and using Eq. 3.27c, we have

$$\frac{x_{.10} - 60}{15} = \Phi^{-1}(0.10) = -\Phi^{-1}(0.90) = -1.28$$

Hence the 10-percentile annual rainfall is

$$x_{.10} = 60 - 1.28(15) = 40.8 \text{ in.}$$
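Where a table of normal probabilities is not at hand, the same three questions can be answered with the `statistics.NormalDist` class of the Python standard library (Python 3.8 or later). The following is a sketch, not the book's procedure; small differences from the tabulated answers arise because the table look-ups round the standardized variate to two decimals.

```python
from statistics import NormalDist

rain = NormalDist(mu=60, sigma=15)   # annual rainfall, in inches

# (a) P(40 < X <= 70), Eq. 3.28
p_between = rain.cdf(70) - rain.cdf(40)

# (b) P(X >= 30)
p_at_least_30 = 1 - rain.cdf(30)

# (c) 10-percentile value: cumulative probability of 0.10
x_10 = rain.inv_cdf(0.10)

print(p_between, p_at_least_30, x_10)
```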


EXAMPLE 3.7

A shell structure is resting on three supports, A, B, and C, as shown in Fig. E3.7.
Even though the loads from the roof, transmitted to the three supports, can be
estimated accurately, the soil conditions under A, B, and C are not completely
predictable. Assume that the settlements $\rho_A$, $\rho_B$, $\rho_C$ are independent normal
variates with means 2, 2.5, 3 cm (centimeter) and coefficients of variation 20%,
20%, 25%, respectively.

(a) What is the probability that the maximum settlement will exceed 4 cm?

(b) If it is known that A and B have settled 2.5 and 3.5 cm, respectively, what is
the probability that the maximum differential settlement will not exceed 0.8 cm?
That it will not exceed 1.5 cm?

Figure E3.7

Solutions

(a)
$$P(\max \rho > 4 \text{ cm}) = 1 - P(\max \rho \le 4 \text{ cm})
= 1 - P(\rho_A \le 4 \cap \rho_B \le 4 \cap \rho_C \le 4)
= 1 - P(\rho_A \le 4) \, P(\rho_B \le 4) \, P(\rho_C \le 4)$$

$$= 1 - \Phi\left(\frac{4 - 2}{0.4}\right) \Phi\left(\frac{4 - 2.5}{0.5}\right) \Phi\left(\frac{4 - 3}{0.75}\right)
= 1 - \Phi(5) \, \Phi(3) \, \Phi(1.333)$$

$$= 1 - 1 \times 0.9986 \times 0.9088 = 0.0925$$

(b) Since the differential settlement between A and B is $\Delta_{AB}$ = 3.5 cm - 2.5 cm = 1 cm,

$$P(\Delta_{\max} \le 0.8 \text{ cm}) = 0$$

regardless of what $\rho_C$ is.

For the event that $\Delta_{\max}$ is less than 1.5 cm, we have to know what $\rho_C$ is. If $\rho_C$ < 1 cm
or $\rho_C$ > 4 cm, then $\Delta_{AC}$ > 1.5 cm; also if $\rho_C$ < 2 cm or $\rho_C$ > 5 cm, then $\Delta_{BC}$ >
1.5 cm. From these two conditions we see that the acceptable region of $\rho_C$ is
(2 cm $\le \rho_C \le$ 4 cm). Any other values of $\rho_C$ will definitely give rise to a maximum
differential settlement exceeding 1.5 cm. Therefore

$$P(\Delta_{\max} \le 1.5 \text{ cm}) = P(2 \text{ cm} \le \rho_C \le 4 \text{ cm})
= \Phi\left(\frac{4 - 3}{0.75}\right) - \Phi\left(\frac{2 - 3}{0.75}\right)$$

$$= \Phi(1.333) - \Phi(-1.333) = 0.9088 - 0.0912 = 0.8176$$
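Both parts of this example reduce to products and differences of normal probabilities, which can be checked as follows. This is a minimal sketch using `statistics.NormalDist`; the dictionary layout and variable names are illustrative choices, and the standard deviations are obtained as COV times mean.

```python
from statistics import NormalDist

means = {"A": 2.0, "B": 2.5, "C": 3.0}    # mean settlements, cm
covs = {"A": 0.20, "B": 0.20, "C": 0.25}  # coefficients of variation
supports = {k: NormalDist(means[k], covs[k] * means[k]) for k in means}

# (a) Independence: P(all three <= 4 cm) is the product of the marginals
p_all_below = 1.0
for dist in supports.values():
    p_all_below *= dist.cdf(4.0)
p_max_exceeds = 1 - p_all_below

# (b) With A = 2.5 cm and B = 3.5 cm known, the maximum differential
# settlement stays within 1.5 cm only if 2 cm <= rho_C <= 4 cm
c = supports["C"]
p_diff_ok = c.cdf(4.0) - c.cdf(2.0)

print(p_max_exceeds, p_diff_ok)
```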


A random variable X has a logarithmic normal (or simply log-normal)
probability distribution if ln X (the natural logarithm of X) is normal.
In this case, the density function of X is

$$f_X(x) = \frac{1}{\sqrt{2\pi}\,\zeta x} \exp\left[-\frac{1}{2}\left(\frac{\ln x - \lambda}{\zeta}\right)^2\right], \quad 0 \le x < \infty \tag{3.29}$$

where $\lambda$ and $\zeta$, the mean and standard deviation of ln X, respectively,
are the parameters of the distribution.

Figure 3.8 Log-normal density functions

Equation 3.29 is illustrated in Fig. 3.8 for various values of $\zeta$.
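Because of its relationship with the normal distribution (that is, involving
a logarithmic transformation), probabilities associated with a log-normal
variate can also be determined using the table of standard normal
probabilities. We show this as follows.

On the basis of Eq. 3.3, the probability that X will assume values in an
interval (a, b] is

$$P(a < X \le b) = \int_a^b \frac{1}{\sqrt{2\pi}\,\zeta x} \exp\left[-\frac{1}{2}\left(\frac{\ln x - \lambda}{\zeta}\right)^2\right] dx$$

Let

$$s = \frac{\ln x - \lambda}{\zeta}$$

then $dx = x\zeta \, ds$, and

$$P(a < X \le b) = \frac{1}{\sqrt{2\pi}} \int_{(\ln a - \lambda)/\zeta}^{(\ln b - \lambda)/\zeta} e^{-(1/2)s^2} \, ds
= \Phi\left(\frac{\ln b - \lambda}{\zeta}\right) - \Phi\left(\frac{\ln a - \lambda}{\zeta}\right) \tag{3.30}$$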


In view of this convenient facility for calculating probabilities of log-normal
variates and also because the values of the random variable are always
positive, the log-normal distribution may be useful in those applications
where the values of the variate are known to be strictly positive; for
example, the strength and fatigue life of material, the intensity of rainfall,
the time for project completion, and the volume of air traffic.

We observe from Eq. 3.30 that the probability is a function of the
parameters $\lambda$ and $\zeta$. These parameters are related to the mean $\mu$ and standard
deviation $\sigma$ of the variate as follows.

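Let Y = ln X, which is $N(\lambda, \zeta)$. It follows that $X = e^Y$ and

$$\mu = E(X) = E(e^Y) = \frac{1}{\sqrt{2\pi}\,\zeta} \int_{-\infty}^{\infty} e^y \exp\left[-\frac{1}{2}\left(\frac{y - \lambda}{\zeta}\right)^2\right] dy$$

By completing the square on the exponent, we obtain

$$\mu = \left[\frac{1}{\sqrt{2\pi}\,\zeta} \int_{-\infty}^{\infty} \exp\left\{-\frac{1}{2}\left(\frac{y - (\lambda + \zeta^2)}{\zeta}\right)^2\right\} dy\right] \exp\left(\lambda + \tfrac{1}{2}\zeta^2\right)$$

We recognize that the quantity within the brackets is the total unit area
of the Gaussian density function $N(\lambda + \zeta^2, \zeta)$; hence

$$\mu = \exp\left(\lambda + \tfrac{1}{2}\zeta^2\right)$$

Thus we have

$$\lambda = \ln \mu - \tfrac{1}{2}\zeta^2 \tag{3.31}$$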
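Similarly, we derive the variance of X as follows:

$$E(X^2) = \frac{1}{\sqrt{2\pi}\,\zeta} \int_{-\infty}^{\infty} e^{2y} \exp\left[-\frac{1}{2}\left(\frac{y - \lambda}{\zeta}\right)^2\right] dy
= \frac{1}{\sqrt{2\pi}\,\zeta} \int_{-\infty}^{\infty} \exp\left[-\frac{1}{2\zeta^2}\left\{y^2 - 2(\lambda + 2\zeta^2)y + \lambda^2\right\}\right] dy$$

By completing the square on the exponent, the above integral yields

$$E(X^2) = \left[\frac{1}{\sqrt{2\pi}\,\zeta} \int_{-\infty}^{\infty} \exp\left\{-\frac{1}{2}\left(\frac{y - (\lambda + 2\zeta^2)}{\zeta}\right)^2\right\} dy\right] \exp[2(\lambda + \zeta^2)]
= \exp[2(\lambda + \zeta^2)]$$

Thence, according to Eq. 3.12, and using Eq. 3.31,

$$\mathrm{Var}(X) = \exp[2(\lambda + \zeta^2)] - \exp[2(\lambda + \tfrac{1}{2}\zeta^2)] = \mu^2 (e^{\zeta^2} - 1)$$

from which we obtain

$$\zeta^2 = \ln\left(1 + \frac{\sigma^2}{\mu^2}\right) \tag{3.32}$$

If $\sigma/\mu$ is not large, say $\le 0.30$, $\ln[1 + (\sigma^2/\mu^2)] \approx \sigma^2/\mu^2$. In such cases,
therefore,

$$\zeta \approx \frac{\sigma}{\mu} \tag{3.32a}$$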


The median is often used to designate the central value of a log-normal
variate. If $x_m$ is the median, then by definition, Eq. 3.9,

$$P(X \le x_m) = 0.5$$

or

$$\Phi\left(\frac{\ln x_m - \lambda}{\zeta}\right) = 0.5$$

Thus

$$\frac{\ln x_m - \lambda}{\zeta} = \Phi^{-1}(0.5) = 0$$

Therefore, in terms of the median, the parameter $\lambda$ is

$$\lambda = \ln x_m \tag{3.33}$$

conversely,

$$x_m = e^{\lambda} \tag{3.33a}$$

Comparing Eqs. 3.31 and 3.33 and using Eq. 3.32, we obtain the relation
between the mean and median of a log-normal variate as

$$\text{median} = x_m = \frac{\mu}{\sqrt{1 + \sigma^2/\mu^2}} \tag{3.34}$$

This means then that the median of a log-normal variate is always less than
its mean value; that is, $x_m < \mu$.


EXAMPLE 3.8

In Example 3.6, suppose that the annual rainfall has a log-normal distribution
(instead of normal) with the same mean and standard deviation of 60 and 15 in.,
respectively. What would then be the answers to the questions raised in Example 3.6?

We first obtain the parameters $\lambda$ and $\zeta$ as follows. Using Eq. 3.32a,

$$\zeta \approx \frac{15}{60} = 0.25$$

and from Eq. 3.31

$$\lambda = \ln 60 - \tfrac{1}{2}(0.25)^2 = 4.09 - 0.03 = 4.06$$

(a) In this case, the probability that the annual rainfall will be between 40 and
70 in. is

$$P(40 < X \le 70) = \Phi\left(\frac{\ln 70 - 4.06}{0.25}\right) - \Phi\left(\frac{\ln 40 - 4.06}{0.25}\right)
= \Phi(0.75) - \Phi(-1.48) = 0.773373 - 0.069437 = 0.7039$$

(b) The probability that the annual rainfall will be at least 30 in. is

$$P(X \ge 30) = 1 - \Phi\left(\frac{\ln 30 - 4.06}{0.25}\right) = 1 - \Phi(-2.64) = 0.9958$$

(c) The 10-percentile annual rainfall is obtained from

$$\Phi\left(\frac{\ln x_{.10} - 4.06}{0.25}\right) = 0.10$$

$$\frac{\ln x_{.10} - 4.06}{0.25} = -1.28$$

$$\ln x_{.10} = 4.06 - 1.28(0.25) = 3.74$$

Therefore

$$x_{.10} = e^{3.74} = 42.10 \text{ in.}$$
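The parameter conversion and the probability calculations for a log-normal variate can be scripted in a few lines. The sketch below is illustrative: `lognormal_params` and `lognormal_cdf` are hypothetical helpers, and the `approximate` flag switches between the exact relation of Eq. 3.32 and the approximation of Eq. 3.32a used in this example. Results differ slightly from the hand calculation because the latter rounds $\lambda$ and the standardized variate.

```python
import math
from statistics import NormalDist

def lognormal_params(mean, std, approximate=False):
    """Return (lam, zeta) from the mean and standard deviation of X."""
    cov = std / mean
    # Eq. 3.32a if approximate, otherwise Eq. 3.32
    zeta = cov if approximate else math.sqrt(math.log(1 + cov**2))
    lam = math.log(mean) - 0.5 * zeta**2   # Eq. 3.31
    return lam, zeta

def lognormal_cdf(x, lam, zeta):
    """P(X <= x) through the standard normal CDF (Eq. 3.30)."""
    return NormalDist().cdf((math.log(x) - lam) / zeta)

lam, zeta = lognormal_params(60, 15, approximate=True)
p_between = lognormal_cdf(70, lam, zeta) - lognormal_cdf(40, lam, zeta)
p_at_least_30 = 1 - lognormal_cdf(30, lam, zeta)
x_10 = math.exp(lam + zeta * NormalDist().inv_cdf(0.10))

# With the exact parameters, exp(lam) reproduces the median of Eq. 3.34
lam_exact, zeta_exact = lognormal_params(60, 15)
median = math.exp(lam_exact)
print(p_between, p_at_least_30, x_10, median)
```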


3.2.3. Bernoulli sequence and the binomial distribution

Problems of concern to engineers and engineering planners sometimes
require the consideration of the potential occurrence or recurrence of an
event in a sequence of repeated "trials." For example, in allocating a fleet
of construction equipment for a project, the anticipated conditions of
every piece of equipment in the fleet over the project duration would have
some bearing on the determination of the required fleet size; whereas, in
planning the flood control system for a river basin, the annual maximum
flow of the river over a sequence of years would be important in the
determination of the design flood. In these cases, the operational conditions of
each piece of equipment, and the maximum flow of the river each year
relative to a specified flood level, constitute the respective trials. These
problems are also such that there are only two possible outcomes in each trial,
namely, the occurrence or nonoccurrence of an event: each piece of
equipment may or may not malfunction over the duration of the project; each
year, the maximum flow of the river may or may not exceed some specified
flood level.

Problems of the type described above may be modeled by a Bernoulli
sequence, which is based on the following assumptions:

1. Each trial has only two possible outcomes: the occurrence or
nonoccurrence of an event.
2. The probability of occurrence of the event in each trial is constant.
3. The trials are statistically independent.

Therefore, in the examples cited above, if the operational conditions
between pieces of equipment are statistically independent and the probability of
malfunction for every piece of equipment is the same, then the conditions of
the entire fleet of equipment constitute a Bernoulli sequence. Similarly,
if the annual maximum floods are statistically independent, and each year
the probability of the annual flood's exceeding some specified level is
constant, then the annual maximum floods over a series of years also constitute
a Bernoulli sequence.

The binomial distribution. If the probability of occurrence of an
event in each trial is p (and probability of nonoccurrence is 1 - p), then
the probability of exactly x occurrences among n trials in a Bernoulli
sequence is given by the binomial PMF as follows:

$$P(X = x) = \binom{n}{x} p^x (1 - p)^{n-x}, \quad x = 0, 1, 2, \ldots, n \tag{3.35}$$

where n and p are parameters, and $\binom{n}{x} = n!/[x!(n - x)!]$ is the binomial
coefficient (see Appendix B). The PMF for such a distribution with p =
0.80 and n = 3 was illustrated earlier (Example 3.1).

We observe that the probability of realizing a particular sequence of
exactly x occurrences of the event among n trials is $p^x (1 - p)^{n-x}$. However,
the specific sequence of trials in which the event may occur x times can be
permuted among the n trials, so that the number of distinct sequences with
exactly x occurrences is $\binom{n}{x}$; for example, if there are x breakdowns among a
fleet of n pieces of equipment, the x breakdowns may be realized in $\binom{n}{x}$
different sequences. Thus we obtain Eq. 3.35.
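The counting argument can be verified by brute force for small n. The sketch below, which is an illustration rather than anything from the text, enumerates every sequence of n Bernoulli trials, adds up the probabilities of those with exactly x occurrences, and compares the totals with Eq. 3.35; the function names are hypothetical.

```python
from itertools import product
from math import comb

def pmf_by_enumeration(n, p):
    """Sum the probability of every 0/1 sequence, grouped by occurrences."""
    pmf = [0.0] * (n + 1)
    for seq in product((0, 1), repeat=n):   # all 2**n sequences
        k = sum(seq)                        # occurrences in this sequence
        pmf[k] += p**k * (1 - p) ** (n - k)
    return pmf

def pmf_by_formula(n, p):
    """Binomial PMF of Eq. 3.35."""
    return [comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(n + 1)]

# n = 3, p = 0.80, the case illustrated in Example 3.1
enumerated = pmf_by_enumeration(3, 0.8)
closed_form = pmf_by_formula(3, 0.8)
print(enumerated, closed_form)
```

Enumeration grows as $2^n$ and is only practical for small n; it is shown here purely to make the role of the binomial coefficient concrete.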


EXAMPLE 3.9

Suppose that five road graders are used in a highway project. The operational
life T of each grader is assumed to have a log-normal distribution with a mean
life of 1500 hr and a COV of 30% (see Fig. E3.9a). Among the five graders,
what is the probability that two of them will malfunction in less than 900 hr
of operation? Assume statistical independence between the conditions of the
graders.

Figure E3.9a (log-normal, $\mu$ = 1500 hr, $\delta$ = 0.30; horizontal axis: life, hr)
Figure E3.9b

Each grader may or may not malfunction within 900 hr of operation. The probability
of malfunction within this period is determined as follows:

$$\zeta \approx 0.30$$

$$\lambda = \ln 1500 - \tfrac{1}{2}(0.3)^2 = 7.27$$

Therefore the probability of a machine malfunctioning in 900 hr (see Fig. E3.9a) is

$$p = P(T < 900) = \Phi\left(\frac{\ln 900 - 7.27}{0.30}\right) = \Phi(-1.56) = 0.0594$$

For the five machines taken collectively, the actual operational lives may
conceivably be as shown in Fig. E3.9b. That is, as illustrated in Fig. E3.9b, machines 1
and 4 have operational lives less than 900 hr. The probability that machines 1 and 4
will malfunction within 900 hr whereas 2, 3, and 5 remain operational is $p^2(1 - p)^3$.
But the two malfunctions may happen to any two among the five machines;
consequently, if X is the number of machines malfunctioning in 900 hr,

$$P(X = 2) = \binom{5}{2} (0.0594)^2 (0.9406)^3 = \frac{5!}{2!\,3!} (0.0594)^2 (0.9406)^3 = 0.0294$$

The probability of malfunction among the five graders (that is, malfunction in
one or more machines) is

$$P(X \ge 1) = 1 - P(X = 0) = 1 - (0.9406)^5 = 0.2638$$


In spite of its simplicity, the Bernoulli model is quite useful in many
engineering applications. Engineering problems involving situations with
only two alternative possibilities are numerous. Aside from those cited and
illustrated above, other problems of this nature include the following. In a
series of soil borings, each boring may or may not encounter boulders; in
monitoring the daily water quality of a river on the downstream side of an
industrial plant, the water tested daily may or may not meet the pollution
control standards; the individual items produced on an assembly line in an
industrial plant may or may not pass the inspection to assure product
quality; and a nuclear power plant may or may not be hit by a tornado in a
year. In each case, if the situation is repeated, we have a Bernoulli
sequence.

It should be emphasized also that in modeling problems with the
Bernoulli sequence, the individual trials must be discrete. In spite of this
requirement, however, certain continuous problems may be modeled
(approximately at least) with the Bernoulli sequence. For example, time and
space problems, which are generally continuous, may be modeled with the
Bernoulli sequence by discretizing time (or space) into finite intervals and
admitting only two possibilities within each interval; what happens in
each time (or space) interval then constitutes a trial, and the series of
intervals constitutes a Bernoulli sequence. Consider for example the following.


EXAMPLE 3.10

In planning the flood control system for a river, the yearly maximum flood of
the river is of concern. Suppose that the probability of the annual maximum flood
exceeding some specified design level $h_0$ is 0.10 in any year; what is the probability
that the level $h_0$ will be exceeded once in the next five years?

In this case, we observe that the natural time interval is one year, and within
each year there is only one maximum flood that may or may not exceed the level $h_0$.
Therefore the series of annual floods can be modeled as a Bernoulli sequence.
Furthermore, assuming that $h_0$ is high enough that there is no likelihood of its
being exceeded more than once a year, the number of exceedances of level $h_0$,
therefore, has a binomial distribution. On this basis, if X is the number of
exceedances of flood level $h_0$ in the next five years, then we have

$$P(X = 1) = \binom{5}{1} (0.1)(0.9)^4 = 0.328$$

The probability that there will be at most one exceedance of level $h_0$ (that is,
one or none) in the next five years is

$$P(X \le 1) = P(X = 0) + P(X = 1)
= \binom{5}{0} (0.1)^0 (0.9)^5 + \binom{5}{1} (0.1)(0.9)^4 = 0.590 + 0.328 = 0.918$$


3.2.4. The geometric distribution

In a Bernoulli sequence, the number of trials until a specified event occurs
for the first time is governed by the geometric distribution. We observe that
if the first occurrence of the event is realized on the tth trial, then there
must be no occurrence of this event in any of the prior (t - 1) trials.
Therefore, if T is the appropriate random variable,

$$P(T = t) = p q^{t-1}, \quad t = 1, 2, \ldots \tag{3.36}$$

in which q = 1 - p. This is known as the geometric distribution.


The return period. In a time (or space) problem that can be modeled
as a Bernoulli sequence, the number of time (or space) intervals until the
first occurrence of an event is called the first occurrence time.

We observe that if the individual trials (or intervals) in the sequence are
statistically independent, the first occurrence time must also be the time
between any two consecutive occurrences of the same event; that is, the
recurrence time is equal to the first occurrence time.

The recurrence time, therefore, in a Bernoulli sequence also has a
geometric distribution; the mean recurrence time, which is popularly known
in engineering as the (average) return period, therefore, is

$$\bar{T} = E(T) = \sum_{t=1}^{\infty} t \, p q^{t-1} = p(1 + 2q + 3q^2 + \cdots)$$

For q < 1.0, the infinite series in the parentheses yields

$$\frac{1}{(1 - q)^2} = \frac{1}{p^2}$$

Hence

$$\bar{T} = \frac{1}{p} \tag{3.37}$$

Equation 3.37, therefore, means that on the average the time between two
consecutive occurrences of an event is equal to the reciprocal of the
probability of the event within one time unit. It must be emphasized that the
return period is only an average duration between events, and should not be
construed as the actual time between occurrences; the actual time is T,
which is a random variable.


EXAMPLE 3.11

A radio transmission tower is designed for a "50-year wind," that is, a wind
velocity having a return period of 50 years.

(a) What is the probability that the design wind velocity will be exceeded for the
first time on the fifth year after completion of the structure?

In this case, the probability of encountering the 50-year wind in any one year is
p = 1/50 = 0.02. The required probability then is

$$P(T = 5) = (0.02)(0.98)^4 = 0.018$$

(b) What is the probability that the first such wind velocity will occur within 5
years after completion of the structure?

$$P(T \le 5) = \sum_{t=1}^{5} (0.02)(0.98)^{t-1}
= 0.02 + 0.0196 + 0.0192 + 0.0188 + 0.0184 = 0.096$$

It may be recognized that this is the same as the event of at least one 50-year wind
in 5 years; thus the desired probability may also be obtained as $1 - (0.98)^5 = 0.096$.
However, the above is different from the event of experiencing exactly one 50-year
wind in 5 years; the probability in this case would be $\binom{5}{1}(0.02)(0.98)^4 = 0.092$.
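The geometric probabilities of this example, and the mean recurrence time of Eq. 3.37, can be checked numerically. This is a minimal sketch; `geometric_pmf` is a hypothetical helper, and the infinite series for the mean is truncated at 5000 terms, beyond which the contributions are negligible for p = 0.02.

```python
def geometric_pmf(t, p):
    """P(T = t): first occurrence on trial t (Eq. 3.36)."""
    return p * (1 - p) ** (t - 1)

p = 1 / 50   # annual probability of the 50-year wind

p_first_on_5 = geometric_pmf(5, p)                          # part (a)
p_within_5 = sum(geometric_pmf(t, p) for t in range(1, 6))  # part (b)

# Truncated series for the mean recurrence time; should approach 1/p
mean_recurrence = sum(t * geometric_pmf(t, p) for t in range(1, 5001))

print(p_first_on_5, p_within_5, mean_recurrence)
```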


EXAMPLE 3.12

An offshore structure is designed for a height of 8 m above the mean sea level
(see Fig. E3.12). This height corresponds to a 10% probability of being exceeded by
sea waves in a year. What is the probability that the structure will be subjected to
waves exceeding 8 m within the return period of the design wave height?

The return period of the design wave is

$$\bar{T} = \frac{1}{0.10} = 10 \text{ years}$$

Figure E3.12

Therefore

$$P(H > 8 \text{ m in 10 years}) = 1 - (0.90)^{10} = 1 - 0.3487 = 0.6513$$

If it is assumed that, when subjected to waves exceeding the design height, there
is a probability of 20% that the structure may be damaged, what is the probability
of damage to the structure within 3 years?

This probability should take into consideration that there may be 0, 1, 2, or 3
exceedances in 3 years, assuming the likelihood of more than one such wave in a
year is negligible. Furthermore, assume that structural damages from more than
one exceedance are statistically independent. Then, according to the total
probability theorem,

$$P(\text{no damage in 3 years}) = 1.00(0.90)^3 + 0.80[3(0.10)(0.90)^2]
+ (0.80)^2[3(0.10)^2(0.90)] + (0.80)^3(0.10)^3 = 0.9412$$

Therefore

$$P(\text{damage in 3 years}) = 0.0588$$
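The total-probability sum in this example generalizes to any number of years. The sketch below is an illustration under the same assumptions as the example (at most one exceedance per year, independent damage events); `prob_no_damage` is a hypothetical helper name.

```python
from math import comb

def prob_no_damage(years, p_exceed, p_damage):
    """Total probability of no damage, summed over k exceedances."""
    total = 0.0
    for k in range(years + 1):
        # Binomial probability of exactly k exceedances in the period
        p_k = comb(years, k) * p_exceed**k * (1 - p_exceed) ** (years - k)
        # The structure survives each of the k exceedances independently
        total += (1 - p_damage) ** k * p_k
    return total

p_safe = prob_no_damage(3, 0.10, 0.20)
p_damage_3yr = 1 - p_safe
print(p_safe, p_damage_3yr)
```

By the binomial theorem the sum collapses to $(1 - 0.10 \times 0.20)^3 = (0.98)^3$, that is, each year carries a 2% chance of damage; this gives a one-line cross-check on the result.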


Observe now the probability of an event occurring within its return period $T$. The probability of no occurrence is
$$P(\text{no occurrence in } T) = (1 - p)^{T}$$
where $p = 1/T$. Expanding the above with the binomial theorem,
$$(1 - p)^{T} = 1 - Tp + \frac{T(T-1)}{2!}p^{2} - \frac{T(T-1)(T-2)}{3!}p^{3} + \cdots$$
But for large $T$ (and thus small $p$), it may be recognized that this is also approximately equal to $e^{-Tp}$. Hence, for large $T$,
$$P(\text{no occurrence in } T) \simeq e^{-Tp} = e^{-1} = 0.368$$
Therefore
$$P(\text{occurrence in } T) \simeq 1 - 0.368 = 0.632 \quad \text{for large } T$$


In other words, for a rare event (that is, large $T$) the probability of the event's occurring within its return period is approximately 0.632. This result is a useful approximation even for return periods that are not very long; for instance, for $T = 10$ (time units),
$$P(\text{occurrence in } T) = 1 - \left(1 - \frac{1}{10}\right)^{10} = 0.651$$
which shows that the error in the above approximation is less than 3%.
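How quickly the exact Bernoulli result approaches the limiting value can be seen by tabulating it for a few return periods. The sketch below is illustrative only; the function name and the chosen values of $T$ are ours.

```python
import math

def prob_within_return_period(T):
    """P(at least one occurrence in T trials) when p = 1/T."""
    return 1.0 - (1.0 - 1.0 / T) ** T

limit = 1.0 - math.exp(-1.0)   # large-T approximation

for T in (5, 10, 50, 100, 1000):
    exact = prob_within_return_period(T)
    rel_error = (exact - limit) / exact
    print(f"T = {T:5d}  exact = {exact:.4f}  relative error = {rel_error:.2%}")
```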




3.2.5. The negative binomial distribution


We saw that the geometric distribution is the probability law governing the number of trials (or time units) until the first occurrence of an event in a Bernoulli sequence. The time until a subsequent occurrence of the same event is governed by the negative binomial distribution. That is, if $T_k$ is the number of trials until the $k$th occurrence of the event in a series of Bernoulli trials, then

$$P(T_k = t) = \begin{cases} \dbinom{t-1}{k-1} p^{k} (1-p)^{t-k} & t = k, k+1, \ldots \\ 0 & t < k \end{cases} \tag{3.38}$$

If the $k$th occurrence is realized at the $t$th trial, there must be exactly $(k - 1)$ occurrences of the event in the prior $(t - 1)$ trials, and at the $t$th trial the event also occurs. Thus, from the binomial law,

$$P(T_k = t) = \binom{t-1}{k-1} p^{k-1} (1-p)^{t-k} \cdot p$$

yielding therefore Eq. 3.38.


EXAMPLE 3.11 (continued) 

In the problem of Example 3.11, what is the probability that a second 50-year wind will occur exactly on the fifth year after completion of the structure?

From Eq. 3.38, the required probability is

$$P(T_2 = 5) = \binom{4}{1}(0.02)^{2}(0.98)^{3} = 0.0015$$


EXAMPLE 3.13 


Suppose that a cable is composed of a number of independent wires (see Fig. E3.13). Occasionally the cable is subjected to high overloads; on such occasions the probability that one wire will fracture is 0.05. Assume that the failure of 2 or more wires during an overload is unlikely. If the cable must be replaced when 3 of the wires have failed, determine the probability that the cable can withstand at least 5 overload applications before being replaced.




Figure E3.13 


This means that the third failure must occur at or after the sixth overloading. Hence, using Eq. 3.38, the required probability is

$$P(T_3 \ge 6) = 1 - P(T_3 < 6) = 1 - \binom{2}{2}(0.05)^{3} - \binom{3}{2}(0.05)^{3}(0.95) - \binom{4}{2}(0.05)^{3}(0.95)^{2} = 1 - 0.00116 = 0.99884$$
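Both negative binomial examples can be evaluated directly from Eq. 3.38. The following is a minimal sketch; the function name `neg_binomial_pmf` is ours and not part of the text.

```python
from math import comb

def neg_binomial_pmf(t, k, p):
    """P(T_k = t): the kth occurrence falls on trial t (Eq. 3.38)."""
    if t < k:
        return 0.0
    return comb(t - 1, k - 1) * p ** k * (1.0 - p) ** (t - k)

# Example 3.11 (continued): second 50-year wind exactly in year 5
second_wind_year_5 = neg_binomial_pmf(5, 2, 0.02)

# Example 3.13: third wire failure at or after the sixth overload
survive_5_overloads = 1.0 - sum(neg_binomial_pmf(t, 3, 0.05) for t in range(3, 6))

print(round(second_wind_year_5, 4), round(survive_5_overloads, 5))
```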


3.2.6. The Poisson process and Poisson distribution 


Many physical problems of interest to engineers involve the possible occurrences of events at any point in time and/or space. For example, fatigue cracks may occur anywhere along a continuous weld; earthquakes could strike at any time and anywhere over a seismically active region; and traffic accidents could happen at any time on a given highway. Conceivably, such space-time problems may be modeled also with the Bernoulli sequence, by dividing the time or space into small intervals, and assuming that an event will either occur or not occur (only two possibilities) within each interval, thus constituting a trial. However, if the event can occur at any instant (or at any point in space), it may occur more than once in a given time or space interval. In such cases, the occurrences of the event may be more appropriately modeled with a Poisson sequence or Poisson process.

Formally, the Poisson process is based on the following assumptions.

1. An event can occur at random and at any time or any point in space.

2. The occurrence(s) of an event in a given time (or space) interval is independent of that in any other nonoverlapping intervals.

3. The probability of occurrence of an event in a small interval $\Delta t$ is proportional to $\Delta t$, and can be given by $\nu \Delta t$, where $\nu$ is the mean rate of occurrence of the event (assumed to be constant); and the probability of two or more occurrences in $\Delta t$ is negligible (of higher orders of $\Delta t$).

On the basis of these assumptions, the number of occurrences of an event in $t$ is given by the Poisson distribution; that is, if $X_t$ is the number of occurrences in time (or space) interval $t$, then

$$P(X_t = x) = \frac{(\nu t)^{x}}{x!} e^{-\nu t} \qquad x = 0, 1, 2, \ldots \tag{3.39}$$

where $\nu$ is the mean occurrence rate; that is, the average number of occurrences of the event per unit time (or space) interval. It follows then that $E(X_t) = \nu t$; it can be shown that the variance of $X_t$ is also $\nu t$.

A derivation of Eq. 3.39 based on the preceding assumptions is given in 
Appendix C. 

The similarities and differences between the Bernoulli sequence and the Poisson process may be clarified with the following illustration. Suppose that, from a previous traffic count, an average of 60 cars per hour was observed to make left turns at an intersection. What is the probability that exactly 10 cars will be making left turns in a 10-minute interval?


An approximate solution would be to divide the hour into 120 30-second intervals; the probability of a left turn in any 30-second interval would be $p = 60/120 = 0.5$. Then, assuming no more than one left turn is possible in a 30-second interval, the problem is reduced to the binomial probability of 10 events in 20 trials, in which the probability of an event in each trial is 0.5; thus

$$P(10 \text{ L.T. in 10 minutes}) = \binom{20}{10}(0.5)^{10}(0.5)^{10}$$


Physically, the solution is approximate because it is implicitly assumed that no more than one car would be making left turns during any 30-second interval; obviously, 2 or more left turns are actually possible.

The solution would be improved if a shorter time interval were chosen. For example, if 10-second intervals were used, then $p = 60/360 = 1/6$, and

$$P(10 \text{ L.T. in 10 minutes}) = \binom{60}{10}\left(\frac{1}{6}\right)^{10}\left(\frac{5}{6}\right)^{60-10}$$


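Further improvements can be made by taking even shorter time intervals. In general, if the time $t$ is divided into $n$ equal intervals, then

$$P(x \text{ events in time } t) = \binom{n}{x}\left(\frac{\lambda}{n}\right)^{x}\left(1 - \frac{\lambda}{n}\right)^{n-x}$$

where $\lambda$ is the average number of events in time $t$. If an event can occur at any time (as in the case of left-turn traffic), the process may tend to the case with $n \to \infty$; then

$$P(x \text{ events in } t) = \lim_{n \to \infty} \binom{n}{x}\left(\frac{\lambda}{n}\right)^{x}\left(1 - \frac{\lambda}{n}\right)^{n-x} = \lim_{n \to \infty} \frac{n}{n} \cdot \frac{n-1}{n} \cdots \frac{n-x+1}{n} \, \frac{\lambda^{x}}{x!}\left(1 - \frac{\lambda}{n}\right)^{n}\left(1 - \frac{\lambda}{n}\right)^{-x}$$

But

$$\lim_{n \to \infty}\left(1 - \frac{\lambda}{n}\right)^{n} = 1 - \lambda + \frac{\lambda^{2}}{2!} - \frac{\lambda^{3}}{3!} + \cdots = e^{-\lambda}$$

Therefore the limit yields

$$P(x \text{ events in } t) = \frac{\lambda^{x}}{x!} e^{-\lambda}$$

which is the Poisson distribution, Eq. 3.39, with $\lambda = \nu t$.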
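The convergence of the binomial approximation to the Poisson result can be seen numerically for the left-turn illustration. The sketch below is ours; the interval counts beyond $n = 20$ and $n = 60$ are additional illustrative choices, not values from the text.

```python
from math import comb, exp, factorial

def binomial_pmf(x, n, p):
    """Probability of x events in n Bernoulli trials."""
    return comb(n, x) * p ** x * (1.0 - p) ** (n - x)

def poisson_pmf(x, lam):
    """Probability of x events when the mean number of events is lam."""
    return lam ** x / factorial(x) * exp(-lam)

lam = 10.0   # 60 left turns per hour gives 10 expected in 10 minutes
x = 10

# n = 20, 60, 600, 6000 correspond to 30-s, 10-s, 1-s and 0.1-s intervals
for n in (20, 60, 600, 6000):
    print(n, round(binomial_pmf(x, n, lam / n), 4))
print("Poisson", round(poisson_pmf(x, lam), 4))
```

As $n$ grows, the binomial values settle toward the Poisson value, which is the limit derived above.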


EXAMPLE 3.14 


Historical records of rainstorms in a town indicate that on the average there had been 4 rainstorms per year over the last 20 years. Assuming that the occurrence of rainstorms is a Poisson process, what is the probability that there would be no rainstorms next year?

$$P(X_t = 0) = \frac{(4 \times 1)^{0}}{0!} e^{-4} = 0.018$$

The probability that exactly 4 rainstorms will occur in the next year is given by

$$P(X_t = 4) = \frac{4^{4}}{4!} e^{-4} = 0.195$$


Figure E3.14 PMF of number of rainstorms in a year


This last result indicates that although the average yearly occurrence of rainstorms is 4, the probability of having exactly 4 rainstorms in a year is only about 20%. The probability of 2 or more rainstorms in the next year is

$$P(X_t \ge 2) = \sum_{x=2}^{\infty} \frac{4^{x}}{x!} e^{-4} = 1 - \sum_{x=0}^{1} \frac{4^{x}}{x!} e^{-4} = 1 - 0.0183 - 0.0733 = 0.908$$

The PMF of the number of rainstorms in a year is shown graphically in Fig. E3.14.
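The three probabilities of this example follow directly from Eq. 3.39. A minimal sketch, with function and variable names of our own choosing:

```python
from math import exp, factorial

def poisson_pmf(x, nu, t=1.0):
    """P(X_t = x) for mean rate nu over an interval of length t (Eq. 3.39)."""
    lam = nu * t
    return lam ** x / factorial(x) * exp(-lam)

nu = 4.0   # rainstorms per year

p_none = poisson_pmf(0, nu)
p_four = poisson_pmf(4, nu)
p_two_or_more = 1.0 - poisson_pmf(0, nu) - poisson_pmf(1, nu)

print(round(p_none, 3), round(p_four, 3), round(p_two_or_more, 3))
```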


EXAMPLE 3.15 (Design of left-turn bay)


For the purpose of designing a left-turn bay at a highway intersection, the left turns of vehicles may be modeled as a Poisson process. If the cycle time of the traffic light (for left turns) is 1 minute, and the design criterion requires a left-turn lane that will be sufficient 96% of the time (which is the criterion in California), what should the lane distance (in terms of car lengths) be to allow for an average of 100 left turns per hour?


Solution 


Let $k$ car lengths be the design length of the left-turn lane. The mean rate of left turns is $\nu = 100/60$ per minute. Therefore, during a 1-minute cycle of the traffic light, the probability of no more than $k$ cars waiting for left turns must be at least 96%; thus

$$P(X_t \le k) = \sum_{x=0}^{k} \frac{1}{x!}\left(\frac{100}{60}\right)^{x} e^{-100/60} = 0.96$$




By trial and error, we obtain

if $k = 4$, $P(X_t \le 4) = 0.972$
if $k = 3$, $P(X_t \le 3) = 0.912$

Therefore a left-turn bay of 4 car lengths is sufficient to satisfy the design requirements.
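The trial-and-error search can be automated by stepping $k$ upward until the cumulative Poisson probability reaches the required level. The sketch below is ours; the function names are hypothetical.

```python
from math import exp, factorial

def poisson_cdf(k, lam):
    """P(X <= k) for a Poisson variate with mean lam."""
    return sum(lam ** x / factorial(x) for x in range(k + 1)) * exp(-lam)

def required_bay_length(lam, reliability):
    """Smallest k (in car lengths) with P(X <= k) >= reliability."""
    k = 0
    while poisson_cdf(k, lam) < reliability:
        k += 1
    return k

lam = 100.0 / 60.0   # mean number of left turns per 1-minute cycle
k_design = required_bay_length(lam, 0.96)

print(k_design, round(poisson_cdf(3, lam), 3), round(poisson_cdf(4, lam), 3))
```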


EXAMPLE 3.16 


The street width at a school crosswalk is $D$ ft, and a child crossing the street walks at a speed of 3.5 ft/sec. In other words, it takes a child $t = D/3.5$ sec to cross the street.

Suppose 60 free intervals ($t$ seconds each) in an hour, on the average, are desired at this crossing; how much average traffic volume can be allowed at this crosswalk before crossing controls will be necessary? Assume that cars crossing the crosswalk constitute a Poisson process.

The number of $t$-sec intervals in an hour is $3600/t$, whereas in an interval of $t$ sec the probability of no cars passing through the crosswalk (according to Eq. 3.39) is $e^{-\nu t}$, if $\nu$ is the average vehicular traffic per second. Therefore the maximum average traffic volume that can be allowed is such that the mean number of free intervals equals 60; that is,

$$\frac{3600}{t} e^{-\nu t} = 60$$

or

$$\frac{3600 \times 3.5}{D} e^{-\nu D/3.5} = 60$$

From which

$$\nu = \frac{3.5}{D} \ln \frac{3600 \times 3.5}{60 D}$$

For $D = 25$ ft,

$$\nu = \frac{3.5}{25} \ln \frac{3600 \times 3.5}{60 \times 25} = 0.298 \text{ vehicles/sec} = 1073 \text{ vehicles/hr}$$


For other street widths $D$, the corresponding traffic volumes are as follows:

D (ft):          25     40     60     75
ν (veh./hr):   1073    522    263    173

Therefore, for various street widths, the hourly traffic volumes given above are the maximum traffic flow that can be allowed before pedestrian crossing controls should be installed. This example points out how critical the street width is to the problem of pedestrian crossings, and indicates how important crossing controls for school children must be for heavily traveled highways. This also means that the wider streets involve much greater hazard to pedestrians. The above method has been adopted by the Joint Committee of the Institute of Traffic Engineers and the International Association of Chiefs of Police (Gerlough, 1955).
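The tabulated volumes follow from solving $(3600/t)e^{-\nu t} = 60$ for $\nu$ with $t = D/3.5$. A short sketch, in which the function name and keyword arguments are ours:

```python
from math import log

def max_traffic_per_hour(width_ft, walk_speed=3.5, free_intervals=60.0):
    """Largest mean traffic volume (veh/hr) giving the desired free intervals per hour."""
    t = width_ft / walk_speed                          # crossing time, sec
    nu = (1.0 / t) * log(3600.0 / (free_intervals * t))  # vehicles per second
    return 3600.0 * nu

for d in (25, 40, 60, 75):
    print(d, round(max_traffic_per_hour(d)))
```

Changing `walk_speed` or `free_intervals` shows how sensitive the allowable volume is to the assumed walking speed and the desired number of gaps.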


There are also problems in which both the Bernoulli and Poisson models 
are useful, as illustrated in the following. 


Figure E3.17 (labels: 15-in. circle; 3-in. hole; x = spacing of drill hole)


EXAMPLE 3.17


Suppose that the soil deposit in a given region contains 0.25% boulders by volume. What is the probability that a 50-ft-deep 3-in.-diameter boring will encounter boulders? Assume hypothetically that boulders are 12-in.-diameter spheres.


Solution 


Assume that boulders are randomly located in the soil mass, and the presence of boulders may be modeled by the Poisson process; the probability of $n$ boulders in a soil volume $V$ is, therefore,

$$P(N = n) = \frac{1}{n!}\left(\frac{0.0025 V}{v}\right)^{n} e^{-0.0025 V / v}$$

where $v$ = volume of one boulder, given by

$$v = \frac{\pi}{6} d^{3} = \frac{\pi}{6} \text{ cu ft}$$

Thus

$$P(N = n) = \frac{1}{n!}\left(0.0025 V \cdot \frac{6}{\pi}\right)^{n} e^{-0.0025 V (6/\pi)} = \frac{1}{n!}(0.00477 V)^{n} e^{-0.00477 V}$$


Next, suppose that a boulder will be encountered whenever the drill hole touches the circumference of the boulder. Then, if the borings are spaced $x$ ft apart (see Fig. E3.17), we determine the probability of an encounter (or hit) per foot of boring depth as follows.

For a 3-in. drill hole, any boulder with its center inside the 15-in. circle as shown in Fig. E3.17 will be hit by the boring. Therefore, if there is one boulder in the 15-in. strip, the probability of hitting it per foot of boring depth is

$$P(\text{hit per ft boring} \mid 1 \text{ boulder}) = \frac{\text{area of 15-in. circle}}{\text{area of 15-in. strip}} = \frac{(\pi/4)(15/12)^{2}}{(15/12)x} = \frac{0.982}{x}$$

However, there may be any number of boulders in the 15-in. strip. According to the Poisson model, the probability of $n$ boulders in the strip of length $x$ (with volume $V = (15/12)x$ per ft depth) is

$$P(n \text{ boulders in strip}) = \frac{1}{n!}\left[0.00477\left(\frac{15}{12}\right)x\right]^{n} \exp\left[-0.00477\left(\frac{15}{12}\right)x\right] = \frac{1}{n!}(0.00596x)^{n} e^{-0.00596x}$$

Also, if the probability of hitting any boulder within the strip is the same and statistically independent of other hits, we have

$$P(\text{no hit per ft boring} \mid n \text{ boulders}) = \left(1 - \frac{0.982}{x}\right)^{n}$$
Then, applying the total probability theorem,

$$q = P(\text{no hit per ft boring}) = \sum_{n=0}^{\infty}\left(1 - \frac{0.982}{x}\right)^{n} \frac{1}{n!}(0.00596x)^{n} e^{-0.00596x} = e^{-0.00596x}\, e^{0.00596x(1 - 0.982/x)} = e^{-0.00596(0.982)}$$

and

$$p = P(\text{hit per ft boring}) = 1 - e^{-0.00596(0.982)} = 1 - e^{-0.00585} \simeq 0.00585$$

Then, assuming the 50 ft of boring to be a Bernoulli sequence with $p = 0.00585$ per ft of boring, we obtain the required probability

$$P(\text{hit in 50 ft boring}) = 1 - (1 - p)^{50} = 1 - (0.99415)^{50} = 0.254$$


(See Problem 3.45 for an alternative approach to this problem.)
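The chain of steps in Example 3.17 (Poisson count of boulders in the strip, a per-boulder hit probability, then a Bernoulli sequence over depth) can be traced numerically. The sketch below is ours and carries full precision rather than the rounded intermediate values, so the last digits may differ slightly from the hand calculation.

```python
from math import exp, pi

boulder_fraction = 0.0025                   # boulders by volume
boulder_volume = pi / 6.0                   # 12-in. sphere, cu ft
rate = boulder_fraction / boulder_volume    # mean number of boulders per cu ft

# Per foot of depth the strip volume is (15/12) x and the hit probability per
# boulder is (pi/4)(15/12)^2 / ((15/12) x), so the spacing x cancels in the product.
mean_hits_per_ft = rate * (pi / 4.0) * (15.0 / 12.0) ** 2

p_hit_per_ft = 1.0 - exp(-mean_hits_per_ft)
p_hit_50ft = 1.0 - (1.0 - p_hit_per_ft) ** 50

print(round(mean_hits_per_ft, 5), round(p_hit_50ft, 3))
```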


Before leaving the Bernoulli and Poisson models, we should point out for emphasis that in both processes the occurrences of an event between trials (in the case of the Bernoulli) and between intervals (in the case of the Poisson model) are statistically independent. More generally, the occurrence of a given event in one trial (or interval) may affect the occurrence or nonoccurrence of the same event in subsequent trials (or intervals). In other words, the probability of occurrence of an event in a given trial may, in general, depend on what happened at earlier trials, and thus is a conditional probability. If this conditional probability depends only on the immediately preceding trial (or interval), the resulting model is called a Markov chain (or Markov process). The elements and applications of Markov chains are developed in Vol. II.


3.2.7. The exponential distribution 


The exponential distribution (also known as the negative exponential) is related to the Poisson process as follows. If events occur according to a Poisson process, then the time $T_1$ till the first occurrence of the event has an exponential distribution. We observe that $(T_1 > t)$ means that no event occurs in time $t$; hence, according to Eq. 3.39,

$$P(T_1 > t) = P(X_t = 0) = e^{-\nu t}$$


$T_1$ is the first occurrence time in a Poisson process. However, since the occurrences of an event in nonoverlapping time intervals in a Poisson process are statistically independent, $T_1$ is also the recurrence time or the time between two consecutive occurrences of the event.

The distribution function of $T_1$, therefore, is

$$F_{T_1}(t) = P(T_1 \le t) = 1 - e^{-\nu t} \tag{3.40a}$$

and its density function is

$$f_{T_1}(t) = \frac{dF}{dt} = \nu e^{-\nu t} \qquad t \ge 0 \tag{3.40b}$$


If $\nu$ is constant (independent of $t$), the mean value of $T_1$ is (see Example 3.5)

$$\mu_{T_1} = \frac{1}{\nu} \tag{3.41}$$

which means that the mean recurrence time or return period for a simple Poisson process is $1/\nu$. This should be compared with the return period of $1/p$ for the Bernoulli model. However, for events with small occurrence rate $\nu$, $1/\nu \simeq 1/p$. To show this, we observe that in a Poisson process with mean occurrence rate $\nu$, the probability of an event occurring in a unit time interval is $p = \nu e^{-\nu} = \nu(1 - \nu + \tfrac{1}{2}\nu^{2} - \cdots)$; thus, for small $\nu$, $p \simeq \nu$.


EXAMPLE 3.18 
Historical records of earthquakes in San Francisco, California, show that during the period 1836-1961 (see Benjamin, 1968), there were 16 earthquakes of intensity VI or more. If the occurrence of such high-intensity earthquakes in this region is assumed to follow a Poisson process, what is the probability that such earthquakes will occur within the next 2 years?

$$\nu = \frac{16}{125} = 0.128 \text{ quake per year}$$

Then

$$P(T_1 \le 2) = 1 - e^{-2(0.128)} = 0.226$$

The probability that no earthquake of this high intensity will occur in the next 10 years is

$$P(T_1 > 10) = e^{-10(0.128)} = 0.278$$


The return period of an intensity-VI earthquake in San Francisco, according to Eq. 3.41, therefore, is

$$\mu_{T_1} = \frac{1}{0.128} = 7.8 \text{ years}$$

meaning that an earthquake of at least intensity VI can be expected, on the average, once in every 7.8 years in San Francisco (assuming that the Poisson process is a reasonable model for the occurrence of high-intensity earthquakes in the area).

More generally, the probability of the occurrence of such earthquakes within a given time $t$ is given by Eq. 3.40a; in the present case, this is

$$P(T_1 \le t) = 1 - e^{-0.128t}$$

which is portrayed graphically in Fig. E3.18.

Figure E3.18 Probabilities of high-intensity earthquakes in San Francisco


Figure 3.9 PDF and CDF of the exponential distribution 




In particular, the probability of high-intensity earthquakes occurring within the return period of 7.8 years is

$$P(T_1 \le 7.8) = 1 - e^{-0.128(7.8)} = 1 - e^{-1.0} = 0.632$$

In fact, for a Poisson process, the probability of an event occurring (once or more) within its return period is always $1 - e^{-\nu(1/\nu)} = 1 - e^{-1} = 0.632$. Compare this with the corresponding probability for large return period of the Bernoulli model (Section 3.2.3).
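The earthquake probabilities of Example 3.18 follow from the exponential distribution function of Eq. 3.40a. A minimal sketch, with names of our own choosing:

```python
from math import exp

nu = 16.0 / 125.0   # earthquakes of intensity VI or more, per year

def exp_cdf(t, rate):
    """P(T_1 <= t) for an exponential recurrence time (Eq. 3.40a)."""
    return 1.0 - exp(-rate * t) if t >= 0 else 0.0

p_within_2 = exp_cdf(2.0, nu)
p_none_in_10 = 1.0 - exp_cdf(10.0, nu)
return_period = 1.0 / nu                         # Eq. 3.41
p_within_return_period = exp_cdf(return_period, nu)

print(round(p_within_2, 3), round(p_none_in_10, 3),
      round(return_period, 1), round(p_within_return_period, 3))
```

Whatever the value of `nu`, the last quantity is $1 - e^{-1}$, since the rate and the return period cancel in the exponent.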

The exponential distribution is useful also as a general-purpose probability function. In general, its density function can be given as

$$f_X(x) = \begin{cases} \lambda e^{-\lambda x} & x \ge 0 \\ 0 & x < 0 \end{cases} \tag{3.42a}$$

where $\lambda$ is a constant parameter. The corresponding distribution function is

$$F_X(x) = \begin{cases} 1 - e^{-\lambda x} & x \ge 0 \\ 0 & x < 0 \end{cases} \tag{3.42b}$$

Graphically, the PDF and CDF of the exponential function would appear as shown in Fig. 3.9.


Shifted exponential distribution. In Eq. 3.42 the density function starts at the origin $x = 0$. The PDF of the exponential distribution, however, can start at any value; the resulting distribution may be called the shifted exponential distribution, with the corresponding PDF and CDF given as follows:

$$f_X(x) = \begin{cases} \lambda e^{-\lambda (x - a)} & x \ge a \\ 0 & x < a \end{cases} \tag{3.43a}$$

and

$$F_X(x) = \begin{cases} 1 - e^{-\lambda (x - a)} & x \ge a \\ 0 & x < a \end{cases} \tag{3.43b}$$

Graphically, these functions would appear as shown in Fig. 3.10.

Figure 3.10 PDF and CDF of the shifted exponential distribution

The exponential distribution may be derived also from other considerations; that is, other than as a consequence of the Poisson process, as described earlier. In particular, this distribution arises, in the theory of reliability (see Vol. II), as the model for the distribution of life or time-to-failure of systems under "chance" failure condition. In this connection, the parameter $\lambda$ is related to the mean life or mean time-to-failure $E(T)$ as

$$E(T) = \frac{1}{\lambda}$$

See Example 3.5.

EXAMPLE 3.19


Suppose that four identical diesel engines are used as prime movers to generate backup electrical power for the emergency control system of a nuclear power plant. Assume that at least two units are required to supply the needed emergency power; in other words, at least two engines must start automatically during an emergency, otherwise this backup system will not be able to deliver the needed power. The operational life $T$ of each engine has an exponential distribution, with a rated mean operational life of 15 years.

Determine the reliability of the emergency backup system for a period of two years; that is, what is the probability that at least two of the four engines will start automatically during any emergency within the first two years of the life of the system?

First, we observe that for each engine, the probability that there will be no failure to start in two years is

$$P(T > 2) = e^{-2/15} = 0.875$$

Then, denoting $N$ as the number of reliable engines, the reliability of the backup system in two years is

$$P(N \ge 2) = \sum_{n=2}^{4} \binom{4}{n}(0.875)^{n}(0.125)^{4-n} = 0.993$$
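The reliability calculation combines an exponential life for each engine with a binomial count of working engines. The sketch below is ours; `k_out_of_n` is a hypothetical helper name, and the engines are taken to be statistically independent, as the binomial sum in the example implies.

```python
from math import comb, exp

mean_life = 15.0   # rated mean operational life, years
mission = 2.0      # period of interest, years

p_engine = exp(-mission / mean_life)   # P(T > 2) for one engine

def k_out_of_n(k, n, p):
    """P(at least k of n independent units work), each with probability p."""
    return sum(comb(n, j) * p ** j * (1.0 - p) ** (n - j) for j in range(k, n + 1))

reliability = k_out_of_n(2, 4, p_engine)

print(round(p_engine, 3), round(reliability, 3))
```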


3.2.8. The gamma distribution 


If the occurrences of an event constitute a Poisson process, then the time until the $k$th occurrence of the event is described by the gamma probability distribution. Let $T_k$ denote the time till the $k$th event; then $(T_k \le t)$ means that $k$ or more events occur in time $t$. Hence, on the basis of Eq. 3.39, the distribution function of $T_k$ is

$$F_{T_k}(t) = \sum_{x=k}^{\infty} P(X_t = x) = 1 - \sum_{x=0}^{k-1} \frac{(\nu t)^{x}}{x!} e^{-\nu t} \tag{3.44a}$$

Equation 3.44a may be obtained also by observing that $(T_k > t)$ means there are at most $(k - 1)$ events occurring within $t$.

The corresponding density function, therefore, is

$$f_{T_k}(t) = \frac{\nu (\nu t)^{k-1}}{(k-1)!} e^{-\nu t} \qquad t \ge 0 \tag{3.44b}$$

The gamma distribution (with integer $k$) is known also as the Erlang distribution. The mean time till the occurrence of the $k$th event is $E(T_k) = k/\nu$, with variance $\mathrm{Var}(T_k) = k/\nu^{2}$.


EXAMPLE 3.20 


Suppose that fatal accidents on a particular highway occur on the average about once every 6 months. If the occurrence of accidents on this road constitutes a Poisson process, the time till the first accident is given by the exponential law with $\nu = 1/6$ accident per month; that is,

$$f_{T_1}(t) = \frac{1}{6} e^{-t/6}$$

The time till the second accident is described by the gamma distribution, or

$$f_{T_2}(t) = \frac{1}{6}\left(\frac{t}{6}\right) e^{-t/6}$$

whereas the time till the third accident would be

$$f_{T_3}(t) = \frac{1}{6} \cdot \frac{1}{2!}\left(\frac{t}{6}\right)^{2} e^{-t/6}$$

The foregoing density functions are shown graphically in Fig. E3.20. The corresponding mean times are 6, 12, and 18 months, respectively.
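Equations 3.44a and 3.44b can be coded directly for integer $k$. The sketch below is illustrative; the 12-month horizon used in the loop is our own choice and does not appear in the example.

```python
from math import exp, factorial

def erlang_pdf(t, k, nu):
    """Density of the time to the kth event (Eq. 3.44b)."""
    return nu * (nu * t) ** (k - 1) / factorial(k - 1) * exp(-nu * t)

def erlang_cdf(t, k, nu):
    """P(T_k <= t), i.e. k or more events in time t (Eq. 3.44a)."""
    return 1.0 - sum((nu * t) ** x / factorial(x) for x in range(k)) * exp(-nu * t)

nu = 1.0 / 6.0   # fatal accidents per month

for k in (1, 2, 3):
    print(k, "mean (months) =", round(k / nu, 1),
          "P(T_k <= 12) =", round(erlang_cdf(12.0, k, nu), 3))
```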


Figure E3.20 PDF for time till the 1st, 2nd, and 3rd accidents on a highway


It may be recognized that the exponential and gamma distributions are the continuous analogues, respectively, of the geometric and negative binomial distributions, in the sense that the exponential and gamma distributions are related to the Poisson process in the same way that the geometric and negative binomial distributions are related to the Bernoulli sequence.

The gamma distribution is useful also as a general-purpose probability distribution. For such purposes, however, it is usually given in a more general form.

We recall that the generalization of the factorial to noninteger numbers is the gamma function,

$$\Gamma(k) = \int_{0}^{\infty} x^{k-1} e^{-x}\, dx \qquad k > 0 \tag{3.45}$$

for which integration by parts yields, for $k > 1$,

$$\Gamma(k) = (k - 1)\Gamma(k - 1)$$

Therefore Eq. 3.44b can be generalized for a random variable $X$ by replacing $(k - 1)!$ with the gamma function. Thus, in general, the gamma density function is

$$f_X(x) = \frac{\nu (\nu x)^{k-1}}{\Gamma(k)} e^{-\nu x} \qquad x \ge 0 \tag{3.46}$$

where $\nu$ and $k$ are parameters. Its mean and variance remain $k/\nu$ and $k/\nu^{2}$, respectively.
Calculation of probability involving the gamma distribution can be performed using tables of incomplete gamma function. Incomplete gamma functions are usually tabulated for the ratio (for example, Harter, 1963)

$$I(u, k) = \frac{\int_{0}^{u} y^{k-1} e^{-y}\, dy}{\Gamma(k)}$$

Then, if $X$ has a gamma distribution we obtain for $a \ge 0$ and $b \ge 0$,

$$P(a < X \le b) = \frac{\nu^{k}}{\Gamma(k)} \int_{a}^{b} x^{k-1} e^{-\nu x}\, dx$$

Letting $y = \nu x$, this integral becomes

$$P(a < X \le b) = \frac{1}{\Gamma(k)}\left[\int_{0}^{\nu b} y^{k-1} e^{-y}\, dy - \int_{0}^{\nu a} y^{k-1} e^{-y}\, dy\right] = I(\nu b, k) - I(\nu a, k)$$

In effect, therefore, the incomplete gamma function ratio is the CDF of the gamma distribution (with $\nu = 1$).
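Where tables are not at hand, the incomplete gamma function ratio can be evaluated numerically. The following sketch uses a standard series expansion, $I(u, k) = \sum_{n \ge 0} u^{k+n} e^{-u} / \Gamma(k+n+1)$, which is our choice of method and not one described in the text; the function names are likewise ours.

```python
from math import exp, lgamma, log

def inc_gamma_ratio(u, k, tol=1e-12, max_terms=10000):
    """I(u, k): incomplete gamma function ratio, summed as a series."""
    if u <= 0.0:
        return 0.0
    total = 0.0
    for n in range(max_terms):
        # term = u^(k+n) e^(-u) / Gamma(k+n+1), computed in log form for stability
        term = exp((k + n) * log(u) - u - lgamma(k + n + 1.0))
        total += term
        if term < tol and n > u:   # stop only after the terms have peaked
            break
    return total

def gamma_prob(a, b, k, nu):
    """P(a < X <= b) for a gamma variate with parameters nu and k."""
    return inc_gamma_ratio(nu * b, k) - inc_gamma_ratio(nu * a, k)

# Time to the third accident (nu = 1/6 per month, k = 3) within 12 months
print(round(gamma_prob(0.0, 12.0, 3, 1.0 / 6.0), 4))
```

For integer $k$ the series reduces to the Poisson sum of Eq. 3.44a, which provides a convenient check.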


3.2.9. The hypergeometric distribution 


The hypergeometric distribution arises when samples from a finite population (consisting of two types of elements, for example, "good" and "bad") are being examined. It is the basic distribution underlying many sampling plans used in connection with acceptance sampling and quality control (see Chapter 9).

Consider a lot of $N$ items, $m$ of which are defective and the remaining $(N - m)$ are good. If a sample of $n$ items is taken (at random) from this lot, the probability of $x$ defective items in the sample is given by the hypergeometric distribution

$$P(X = x) = \frac{\dbinom{m}{x}\dbinom{N-m}{n-x}}{\dbinom{N}{n}} \qquad x = 0, 1, 2, \ldots, m \tag{3.47}$$

In the lot, the number of samples of size $n$ is $\binom{N}{n}$; among these, there are $\binom{m}{x}\binom{N-m}{n-x}$ samples with $x$ defectives. Hence, assuming that the samples are equally likely to be chosen, we obtain Eq. 3.47.




EXAMPLE 3.21 


A box contains 25 strain gages, and 4 of them are known to be defective gages. If 6 gages were used in an experiment, what is the probability that there was one defective gage in the experiment?

In this case, $N = 25$, $m = 4$, and $n = 6$. Hence the required probability is

$$P(X = 1) = \frac{\dbinom{4}{1}\dbinom{21}{5}}{\dbinom{25}{6}} = 0.46$$

The probability that none of the defective gages were used in the experiment is

$$P(X = 0) = \frac{\dbinom{4}{0}\dbinom{21}{6}}{\dbinom{25}{6}} = 0.31$$
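Equation 3.47 is a ratio of counts and is easily evaluated with exact integer binomial coefficients. A minimal sketch; the function name `hypergeom_pmf` is ours.

```python
from math import comb

def hypergeom_pmf(x, N, m, n):
    """P(X = x) defectives in a sample of n from a lot of N with m defectives (Eq. 3.47)."""
    if x < 0 or x > m or n - x < 0 or n - x > N - m:
        return 0.0
    return comb(m, x) * comb(N - m, n - x) / comb(N, n)

# Example 3.21: 25 strain gages, 4 defective, 6 used
p_one = hypergeom_pmf(1, N=25, m=4, n=6)
p_none = hypergeom_pmf(0, N=25, m=4, n=6)

print(round(p_one, 2), round(p_none, 2))
```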


EXAMPLE 3.22 


Suppose that 100 concrete cylinders are to be taken daily at a large construction project. To ensure quality, the acceptance criterion requires that 10 of these cylinders (chosen at random) must be tested and at least 9 of these must have a specified minimum crushing strength. What can we say about the acceptance criterion; is it too stringent?

Whether the acceptance criterion is too stringent, or not stringent enough, depends on whether it is difficult or easy for poor-quality material to go undetected. For example, if there is $d$ percent of defective concrete, then on the basis of the specified acceptance criterion, the probability of the daily concrete mixes' being rejected is (denoting $X$ as the number of defective cylinders)

$$P(X > 1) = 1 - P(X \le 1) = 1 - \left[\frac{\dbinom{100d}{0}\dbinom{100(1-d)}{10}}{\dbinom{100}{10}} + \frac{\dbinom{100d}{1}\dbinom{100(1-d)}{9}}{\dbinom{100}{10}}\right]$$
For example, if $d = 5\%$,

$$P(\text{rejection}) = 1 - \left[\frac{\dbinom{5}{0}\dbinom{95}{10}}{\dbinom{100}{10}} + \frac{\dbinom{5}{1}\dbinom{95}{9}}{\dbinom{100}{10}}\right] = 1 - (0.5837 + 0.3394) = 0.0769$$

whereas, if $d = 2\%$,

$$P(\text{rejection}) = 1 - \left[\frac{\dbinom{2}{0}\dbinom{98}{10}}{\dbinom{100}{10}} + \frac{\dbinom{2}{1}\dbinom{98}{9}}{\dbinom{100}{10}}\right] = 1 - (0.8091 + 0.1818) = 0.0091$$

Therefore, according to these calculations, if 5% of the concrete is defective, it is unlikely (only about 8% probability) to be discovered with the proposed acceptance criterion, whereas if there are only 2% defectives, the likelihood that the material will be rejected is almost nil (0.9% probability).
Hence, if the contract requires concrete with less than 2% defectives then the acceptance criterion is not stringent enough; on the other hand, if material with as much as 5% defectives is acceptable, then the proposed criterion may be satisfactory.
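The stringency of an acceptance criterion is easiest to judge by evaluating the rejection probability over a range of defect fractions. The sketch below is ours: the function name, its keyword arguments, and the defect fractions above 5% are illustrative assumptions rather than values from the example.

```python
from math import comb

def rejection_probability(defect_fraction, lot=100, sample=10, allowed=1):
    """P(more than `allowed` defectives in the sample), hypergeometric model."""
    m = round(defect_fraction * lot)   # number of defectives in the lot
    accept = sum(
        comb(m, x) * comb(lot - m, sample - x) / comb(lot, sample)
        for x in range(min(allowed, m) + 1)
    )
    return 1.0 - accept

for d in (0.02, 0.05, 0.10, 0.20):
    print(f"d = {d:.0%}  P(rejection) = {rejection_probability(d):.4f}")
```

Setting `allowed=0` or increasing `sample` shows how a tighter criterion raises the chance of detecting a given defect fraction.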


3.2.10. The beta distribution 

A probability distribution appropriate for a random variable whose values are bounded, say between finite limits $a$ and $b$, is the beta distribution. The density function of such a distribution is

$$f_X(x) = \begin{cases} \dfrac{1}{B(q, r)} \dfrac{(x - a)^{q-1}(b - x)^{r-1}}{(b - a)^{q+r-1}} & a \le x \le b \\ 0 & \text{elsewhere} \end{cases} \tag{3.48}$$

in which $q$ and $r$ are parameters of the distribution, and $B(q, r)$ is the beta function

$$B(q, r) = \int_{0}^{1} x^{q-1}(1 - x)^{r-1}\, dx \tag{3.49}$$

which is related to the gamma function as follows:

$$B(q, r) = \frac{\Gamma(q)\Gamma(r)}{\Gamma(q + r)} \tag{3.49a}$$


Depending on the parameters $q$ and $r$, the density function of the beta distribution will have different shapes. Figure 3.11 shows the density function between 2 and 12 with $q = 2.0$ and $r = 6.0$.

Figure 3.11 A beta distribution ($q = 2.0$; $r = 6.0$)

If the values of the variate are limited between 0 and 1.0 (that is, $a = 0$ and $b = 1.0$), Eq. 3.48 becomes

$$f_X(x) = \begin{cases} \dfrac{1}{B(q, r)} x^{q-1}(1 - x)^{r-1} & 0 \le x \le 1.0 \\ 0 & \text{elsewhere} \end{cases} \tag{3.48a}$$

which can be called the standard beta distribution. Figure 3.12 shows the standard beta density function with different values of $q$ and $r$.

Figure 3.12 Standard beta PDF

The probability associated with a beta distribution can be evaluated in terms of the incomplete beta function, which is defined as

$$B_x(q, r) = \int_{0}^{x} y^{q-1}(1 - y)^{r-1}\, dy \qquad 0 < x < 1.0 \tag{3.50}$$


For example, to evaluate the probability between $x = x_1$ and $x = x_2$, we have

$$P(x_1 < X \le x_2) = \int_{x_1}^{x_2} \frac{1}{B(q, r)} \frac{(x - a)^{q-1}(b - x)^{r-1}}{(b - a)^{q+r-1}}\, dx$$

Let

$$y = \frac{x - a}{b - a}$$

so that

$$dy = \frac{dx}{b - a} \quad \text{and} \quad (1 - y) = \frac{b - x}{b - a}$$

With this change of variable, the preceding integral can be shown to be

$$P(x_1 < X \le x_2) = \frac{1}{B(q, r)}\left[\int_{0}^{(x_2 - a)/(b - a)} y^{q-1}(1 - y)^{r-1}\, dy - \int_{0}^{(x_1 - a)/(b - a)} y^{q-1}(1 - y)^{r-1}\, dy\right]$$

We recognize that each of the last two integrals is an incomplete beta function, $B_u(q, r)$ and $B_v(q, r)$, respectively, where $u = (x_2 - a)/(b - a)$ and $v = (x_1 - a)/(b - a)$. Thus the probability is

$$P(x_1 < X \le x_2) = \frac{1}{B(q, r)}\left[B_u(q, r) - B_v(q, r)\right] \tag{3.51}$$


Values of the incomplete beta function ratio $B_x(q, r)/B(q, r)$ have been tabulated; for example, by Pearson (1934), and Pearson and Johnson (1968). Therefore, probabilities involving the beta distribution can be evaluated conveniently using tables of the incomplete beta function ratio. In fact, by virtue of Eq. 3.50, we also observe that the CDF of the standard beta distribution, Eq. 3.48a, with parameters $q$ and $r$, is given by

$$F_X(x) = \frac{B_x(q, r)}{B(q, r)}$$

Effectively, therefore, tables of the incomplete beta function ratio are also the tables for the CDF of the standard beta distribution.
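In place of tables, the incomplete beta function ratio can be approximated by numerical integration of Eq. 3.50. The sketch below uses the midpoint rule, which is our own choice of method, and assumes $q \ge 1$ and $r \ge 1$ so that the integrand stays bounded; all function names are hypothetical.

```python
from math import exp, lgamma

def beta_function(q, r):
    """B(q, r) from the gamma-function relation (Eq. 3.49a)."""
    return exp(lgamma(q) + lgamma(r) - lgamma(q + r))

def inc_beta_ratio(x, q, r, steps=2000):
    """B_x(q, r) / B(q, r) by the midpoint rule; assumes q >= 1 and r >= 1."""
    h = x / steps
    area = sum(
        ((i + 0.5) * h) ** (q - 1.0) * (1.0 - (i + 0.5) * h) ** (r - 1.0)
        for i in range(steps)
    ) * h
    return area / beta_function(q, r)

def beta_prob(x1, x2, a, b, q, r):
    """P(x1 < X <= x2) for a beta variate on [a, b] (Eq. 3.51)."""
    u = (x2 - a) / (b - a)
    v = (x1 - a) / (b - a)
    return inc_beta_ratio(u, q, r) - inc_beta_ratio(v, q, r)

# Distribution of Fig. 3.11: a = 2, b = 12, q = 2.0, r = 6.0
print(round(beta_prob(2.0, 7.0, 2.0, 12.0, 2.0, 6.0), 4))
```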
The mean and variance of the beta distribution, Eq. 3.48, are

$$\mu_X = a + \frac{q}{q + r}(b - a) \tag{3.52}$$

$$\sigma_X^{2} = \frac{qr}{(q + r)^{2}(q + r + 1)}(b - a)^{2}$$

and the coefficient of skewness is

$$\theta = \frac{2(r - q)}{(q + r)(q + r + 2)} \cdot \frac{(b - a)}{\sigma_X}$$

It can be observed that the skewness of the beta distribution is positive when $q < r$, and negative when $q > r$, whereas when $q = r$ the distribution is symmetrical ($\theta = 0$) about the mean value, as illustrated in Fig. 3.12. Therefore, with suitable choice of the parameters $q$ and $r$, the beta distribution may be used to fit a wide variety of shapes of frequency diagrams.


EXAMPLE 3.23 


The duration of an activity in a construction project has been estimated by the contractor to be as follows:

minimum duration = 5 days
maximum duration = 10 days
expected duration = 7 days

The contractor also estimated the coefficient of variation of the duration to be 10%. Determine the beta distribution for the duration $T$ of the activity.

It is obvious that $a$ and $b$ will be 5 and 10 days, respectively. By equating the expression for the mean value, we have

$$5 + \frac{q}{q + r}(10 - 5) = 7$$

giving $q = 2r/3$. Substituting this into the expression for the variance,

$$\frac{qr}{(q + r)^{2}(q + r + 1)}(10 - 5)^{2} = (0.1 \times 7)^{2}$$

With $q = 2r/3$ the left side reduces to $6/(q + r + 1)$, so that $q + r = 11.24$, thus obtaining $q = 4.50$ and $r = 6.75$. The appropriate beta distribution, therefore, has parameters $q = 4.50$ and $r = 6.75$.

The probability that this activity will be completed within 9 days is given by

$$P(T \le 9) = \frac{B_u(4.50, 6.75)}{B(4.50, 6.75)}$$

with $u = (9 - 5)/(10 - 5) = 0.8$. From tables of the incomplete beta function ratio,

$$P(T \le 9) = 1 - 0.002 = 0.998$$

3.2.11. Other distributions 


The probability distributions described thus far are 
and important. However, these are not inclusive; Í 
other distributions may prove to be more appropria 
the triangular and uniform distributions. Among otl 


‘butions are the t-distribution, the chi-square (x 


distribution, and the Pearson system (Elderton, 19 
important in statistical analysis; for example, the 
for determining the confidence interval of the pop 
known variance, whereas the chi-square distribution 
estimation of the population variance (see Chapter 

Another group of probability distributions of 
engineering design is the distribution of extreme 
distributions are presented in Vol. II, with special re 
sociated with extreme natural hazards. 


3.3. MULTIPLE RANDOM VARIABLES 


The concept of a random variable and its probabil 
extended to two or more random variables. In order 
events that are the results of two or more physics 
in a sample space may be mapped into two (or m 
real space; implicitly this requires two or more ran 

Consider, for example, the rainfall intensity a. 
resulting runoff of a river; we may use 2 random va 
denote the values of the measured rainfall intensity 
random variable Y whose values y are the possible 1 


* Tables of the incomplete beta function ratio are usually g 
the ratio is i 
$\frac{B_u(q, r)}{B(q, r)} = 1 - \frac{B_{1-u}(r, q)}{B(r, q)}$




Accordingly, (X = x, Y = y) and (X ≤ x, Y ≤ y) are joint events* defined
by values of the random variables in the xy space. Obviously, this notion
can be extended to multiple random variables.


3.3.1. Joint and conditional probability distributions


Since values of X and Y represent events, there are probabilities associated
with any pair of values x and y; the probabilities for all possible pairs of x
and y may be described with the joint distribution function of the random
variables X and Y, defined as

$F_{X,Y}(x, y) = P(X \le x, Y \le y)$    (3.56)

which is the cumulative probability of the joint occurrences of the events
identified by X ≤ x and Y ≤ y. In order to comply with the axioms of
probability, the joint distribution function must satisfy the following:

(a) $F_{X,Y}(-\infty, -\infty) = 0$;  $F_{X,Y}(\infty, \infty) = 1.0$
(b) $F_{X,Y}(-\infty, y) = 0$;  $F_{X,Y}(\infty, y) = F_Y(y)$
    $F_{X,Y}(x, -\infty) = 0$;  $F_{X,Y}(x, \infty) = F_X(x)$
(c) $F_{X,Y}(x, y)$ is nonnegative, and a nondecreasing function
    of x and y

If the random variables X and Y are discrete, the probability distribution
may also be described with the joint probability mass function (PMF),
which is simply

$p_{X,Y}(x, y) = P(X = x, Y = y)$    (3.57)

Then the distribution function becomes

$F_{X,Y}(x, y) = \sum_{\{x_i \le x,\; y_j \le y\}} p_{X,Y}(x_i, y_j)$    (3.58)

which is simply the sum of probabilities associated with all point pairs
(x_i, y_j) in the subset {x_i ≤ x, y_j ≤ y}.
The probability of (X = x) may depend on the values of Y, or vice
versa; accordingly, by virtue of Eq. 2.11, we have the conditional probability
mass function

$p_{X|Y}(x \mid y) = P(X = x \mid Y = y) = \frac{p_{X,Y}(x, y)}{p_Y(y)}$    (3.59)

if p_Y(y) ≠ 0. Similarly,

$p_{Y|X}(y \mid x) = \frac{p_{X,Y}(x, y)}{p_X(x)}$    (3.59a)

if p_X(x) ≠ 0.

* We will use the notation:
$(X = x, Y = y) \equiv [X = x \cap Y = y]$
$(X \le x, Y \le y) \equiv [X \le x \cap Y \le y]$

The PMF of the individual random variables may be obtained from the
joint PMF; applying the theorem of total probability, Eq. 2.19, we obtain
the marginal PMF of X as

$p_X(x) = \sum_{\text{all } y_j} P(X = x \mid Y = y_j)\, P(Y = y_j)$
$= \sum_{\text{all } y_j} P(X = x, Y = y_j)$
$= \sum_{\text{all } y_j} p_{X,Y}(x, y_j)$    (3.60)

By the same token,

$p_Y(y) = \sum_{\text{all } x_i} p_{X,Y}(x_i, y)$    (3.60a)

If the random variables X and Y are statistically independent (meaning
that the events X = x and Y = y are statistically independent),

$p_{X|Y}(x \mid y) = p_X(x)$  and  $p_{Y|X}(y \mid x) = p_Y(y)$

Hence, Eq. 3.57 becomes

$p_{X,Y}(x, y) = p_X(x)\, p_Y(y)$    (3.61)


EXAMPLE 3.24


Suppose that, from a survey of construction labor, the work duration (in number
of hours) per day and the average productivity (in terms of percent efficiency)
were recorded as follows:

Duration, productivity    No. of           Relative
(x hr, y %)               observations     frequency
6, 50                          2           0.014
6, 70                          5           0.036
6, 90                         10           0.072
8, 50                          5           0.036
8, 70                         30           0.216
8, 90                         25           0.180
10, 50                         8           0.058
10, 70                        25           0.180
10, 90                        11           0.079
12, 50                        10           0.072
12, 70                         6           0.043
12, 90                         2           0.014

Total = 139

Figure E3.24a Joint PMF p_{X,Y}(x, y) (horizontal axis: duration, hours)


These may be portrayed graphically as shown in Fig. E3.24a.
The marginal PMF for X, the distribution of work duration, is

$p_X(x) = \sum_{y_j = 50, 70, 90} p_{X,Y}(x, y_j)$

and would appear as shown in Fig. E3.24b. For example, the ordinate at X = 8 is
obtained as

$p_X(8) = 0.036 + 0.216 + 0.180 = 0.432$

Similarly, the marginal PMF for Y, representing the distribution of productivity, 
is shown in Fig. E3.24c. 

If the work duration per day is 8 hr, the probability that the average productivity 
will be 90% is given by the conditional probability of Eq. 3.59a as 


$p_{Y|X}(90\% \mid 8) = \frac{p_{X,Y}(8, 90\%)}{p_X(8)} = \frac{0.180}{0.432} = 0.417$
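The marginal and conditional calculations of this example can be sketched in a few lines. The code below is a minimal illustration, not part of the original example; the variable and function names are hypothetical, and the count of 11 for the pair (10, 90) is inferred from its relative frequency 0.079 and the total of 139 observations.

```python
from collections import defaultdict

# Observation counts from the table of Example 3.24.
# Key = (duration x in hours, productivity y in percent).
# The count 11 for (10, 90) is inferred from 0.079 * 139.
counts = {
    (6, 50): 2,   (6, 70): 5,   (6, 90): 10,
    (8, 50): 5,   (8, 70): 30,  (8, 90): 25,
    (10, 50): 8,  (10, 70): 25, (10, 90): 11,
    (12, 50): 10, (12, 70): 6,  (12, 90): 2,
}
total = sum(counts.values())

# Joint PMF as relative frequencies (Eq. 3.57).
joint = {xy: n / total for xy, n in counts.items()}

# Marginal PMFs by summing over the other variable (Eqs. 3.60 and 3.60a).
marginal_x = defaultdict(float)
marginal_y = defaultdict(float)
for (x, y), p in joint.items():
    marginal_x[x] += p
    marginal_y[y] += p

def conditional_y_given_x(y, x):
    """Conditional PMF p_{Y|X}(y | x) of Eq. 3.59a."""
    return joint[(x, y)] / marginal_x[x]

print(total)
print(round(marginal_x[8], 3))
print(round(conditional_y_given_x(90, 8), 3))
```

Working from the raw counts rather than the rounded relative frequencies avoids accumulating round-off in the marginal sums.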


Figure E3.24b Marginal PMF p_X(x)

Figure E3.24c Marginal PMF p_Y(y)

Figure E3. p_{Y|X}(y | 8)

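If the random variables X and Y are continuous, the joint probability distribution
may also be described with the joint probability density function
(PDF), which may be defined as

$f_{X,Y}(x, y)\,dx\,dy = P(x < X \le x + dx,\; y < Y \le y + dy)$

Then

$F_{X,Y}(x, y) = \int_{-\infty}^{x} \int_{-\infty}^{y} f_{X,Y}(u, v)\,dv\,du$

Conversely, if the partial derivatives exist,

$f_{X,Y}(x, y) = \frac{\partial^2 F_{X,Y}(x, y)}{\partial x\,\partial y}$

Also,

$P(a < X \le b,\; c < Y \le d) = \int_a^b \int_c^d f_{X,Y}(u, v)\,dv\,du$

which is the volume under the surface f_{X,Y}(x, y) as shown in Fig. 3.13.
Analogous to Eq. 3.59, the conditional density function of X, given Y = y, is

$f_{X|Y}(x \mid y) = \frac{f_{X,Y}(x, y)}{f_Y(y)}$

Therefore, in general,

$f_{X,Y}(x, y) = f_{X|Y}(x \mid y)\, f_Y(y)$

or

$f_{X,Y}(x, y) = f_{Y|X}(y \mid x)\, f_X(x)$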


However, if X and Y are statistically independent, that is, f_{X|Y}(x | y) =
f_X(x) and f_{Y|X}(y | x) = f_Y(y), then

$f_{X,Y}(x, y) = f_X(x)\, f_Y(y)$    (3.68)

Figure 3.13 Joint PDF of X and Y


Applying the total probability theorem, we obtain the marginal density
functions,

$f_X(x) = \int_{-\infty}^{\infty} f_{X|Y}(x \mid y)\, f_Y(y)\,dy$
$= \int_{-\infty}^{\infty} f_{X,Y}(x, y)\,dy$    (3.69)

and, similarly,

$f_Y(y) = \int_{-\infty}^{\infty} f_{X,Y}(x, y)\,dx$    (3.70)

The characteristics of a joint density function for two random variables
X and Y, and the associated marginal densities, are portrayed in Fig. 3.14.
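The integration in Eq. 3.69 can be illustrated numerically. The sketch below assumes a hypothetical joint density f(x, y) = x + y on the unit square (it is not taken from the text) and uses the midpoint rule; for this density the marginal is f_X(x) = x + 1/2, which gives a direct check.

```python
def joint_pdf(x, y):
    """Hypothetical joint PDF: f(x, y) = x + y on the unit square, 0 elsewhere."""
    if 0.0 <= x <= 1.0 and 0.0 <= y <= 1.0:
        return x + y
    return 0.0

def marginal_x(x, n=1000):
    """Approximate f_X(x), the integral of f(x, y) over y (midpoint rule)."""
    h = 1.0 / n
    return sum(joint_pdf(x, (j + 0.5) * h) for j in range(n)) * h

def total_probability(n=200):
    """Integrate the marginal over x; the result should be close to 1."""
    h = 1.0 / n
    return sum(marginal_x((i + 0.5) * h, n) for i in range(n)) * h

print(marginal_x(0.3))        # analytical value is 0.3 + 0.5
print(total_probability())    # volume under the joint density
```

The midpoint rule is exact for integrands that are linear in the variable of integration, so the agreement here is limited only by floating-point round-off; a general density would need a finer grid or a better quadrature.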


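EXAMPLE 3.25


An example of a joint density function of two continuous random variables X
and Y is the bivariate normal density function given by

$f_{X,Y}(x, y) = \frac{1}{2\pi \sigma_X \sigma_Y \sqrt{1 - \rho^2}} \exp\left\{ -\frac{1}{2(1 - \rho^2)} \left[ \left(\frac{x - \mu_X}{\sigma_X}\right)^2 - 2\rho \left(\frac{x - \mu_X}{\sigma_X}\right)\left(\frac{y - \mu_Y}{\sigma_Y}\right) + \left(\frac{y - \mu_Y}{\sigma_Y}\right)^2 \right] \right\}$;
$-\infty < x < \infty$;  $-\infty < y < \infty$

in which ρ is the correlation coefficient between X and Y (see Section 3.3.2). Such
a function can be written also as

$f_{X,Y}(x, y) = \frac{1}{\sqrt{2\pi}\,\sigma_X} \exp\left[ -\frac{1}{2} \left(\frac{x - \mu_X}{\sigma_X}\right)^2 \right] \cdot \frac{1}{\sqrt{2\pi}\,\sigma_Y \sqrt{1 - \rho^2}} \exp\left[ -\frac{1}{2} \left( \frac{y - \mu_Y - \rho(\sigma_Y/\sigma_X)(x - \mu_X)}{\sigma_Y \sqrt{1 - \rho^2}} \right)^2 \right]$

Then, in view of Eq. 3.67, we see that the conditional density function of Y given
X = x is

$f_{Y|X}(y \mid x) = \frac{1}{\sqrt{2\pi}\,\sigma_Y \sqrt{1 - \rho^2}} \exp\left[ -\frac{1}{2} \left( \frac{y - \mu_Y - \rho(\sigma_Y/\sigma_X)(x - \mu_X)}{\sigma_Y \sqrt{1 - \rho^2}} \right)^2 \right]$

whereas the marginal density function of X is

$f_X(x) = \frac{1}{\sqrt{2\pi}\,\sigma_X} \exp\left[ -\frac{1}{2} \left(\frac{x - \mu_X}{\sigma_X}\right)^2 \right]$

both of which are Gaussian. In particular, the conditional density is
normal with mean value

$E(Y \mid X = x) = \mu_Y + \rho \frac{\sigma_Y}{\sigma_X}(x - \mu_X)$

and variance

$\mathrm{Var}(Y \mid X = x) = \sigma_Y^2 (1 - \rho^2)$

Similarly, it can be shown that

$f_{X|Y}(x \mid y) = \frac{1}{\sqrt{2\pi}\,\sigma_X \sqrt{1 - \rho^2}} \exp\left[ -\frac{1}{2} \left( \frac{x - \mu_X - \rho(\sigma_X/\sigma_Y)(y - \mu_Y)}{\sigma_X \sqrt{1 - \rho^2}} \right)^2 \right]$

and

$f_Y(y) = \frac{1}{\sqrt{2\pi}\,\sigma_Y} \exp\left[ -\frac{1}{2} \left(\frac{y - \mu_Y}{\sigma_Y}\right)^2 \right]$

Figure 3.14 Joint and marginal PDF of continuous random variables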
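The factorization of the bivariate normal density into a marginal and a conditional normal density can be checked numerically at any point. The following is a minimal sketch; the parameter values and the evaluation point are hypothetical, chosen only for illustration, and the function names are not from the text.

```python
import math

def normal_pdf(z, mean, std):
    """Univariate normal density."""
    return math.exp(-0.5 * ((z - mean) / std) ** 2) / (math.sqrt(2.0 * math.pi) * std)

def bivariate_normal_pdf(x, y, mu_x, mu_y, sd_x, sd_y, rho):
    """Bivariate normal density in its standard form."""
    zx = (x - mu_x) / sd_x
    zy = (y - mu_y) / sd_y
    quad = (zx * zx - 2.0 * rho * zx * zy + zy * zy) / (1.0 - rho * rho)
    norm = 2.0 * math.pi * sd_x * sd_y * math.sqrt(1.0 - rho * rho)
    return math.exp(-0.5 * quad) / norm

def conditional_y_given_x(x, mu_x, mu_y, sd_x, sd_y, rho):
    """Mean and standard deviation of Y given X = x."""
    mean = mu_y + rho * (sd_y / sd_x) * (x - mu_x)
    std = sd_y * math.sqrt(1.0 - rho * rho)
    return mean, std

# Hypothetical parameters and evaluation point.
mu_x, mu_y, sd_x, sd_y, rho = 10.0, 20.0, 2.0, 5.0, 0.6
x, y = 11.0, 24.0

cond_mean, cond_std = conditional_y_given_x(x, mu_x, mu_y, sd_x, sd_y, rho)
joint = bivariate_normal_pdf(x, y, mu_x, mu_y, sd_x, sd_y, rho)
product = normal_pdf(x, mu_x, sd_x) * normal_pdf(y, cond_mean, cond_std)
print(cond_mean, cond_std)
print(joint, product)
```

With these values the conditional mean is shifted above the unconditional mean of Y because x lies above its own mean and ρ is positive, while the conditional standard deviation is smaller than that of Y.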


3.3.2. Covariance and correlation


The joint second moment of X and Y is

$E(XY) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} xy\, f_{X,Y}(x, y)\,dx\,dy$    (3.71)

and if X and Y are statistically independent, Eq. 3.71 becomes (by virtue
of Eq. 3.68)

$E(XY) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} xy\, f_X(x) f_Y(y)\,dx\,dy$
$= \int_{-\infty}^{\infty} x f_X(x)\,dx \int_{-\infty}^{\infty} y f_Y(y)\,dy = E(X)\,E(Y)$    (3.71a)

The joint second moment about the means μ_X and μ_Y is the covariance of X
and Y; that is,

$\mathrm{Cov}(X, Y) = E[(X - \mu_X)(Y - \mu_Y)]$
$= E(XY) - E(X)\,E(Y)$    (3.72)

In view of Eq. 3.71a, Cov(X, Y) = 0 if X and Y are statistically independent.

The physical significance of the covariance can be inferred from Eq.
3.72. If the Cov(X, Y) is large and positive, the values of X and Y tend to
be both large or both small relative to their respective means, whereas if the
Cov(X, Y) is large and negative, the values of X tend to be large when the
values of Y are small, and vice versa, relative to their respective means;
whereas if the Cov(X, Y) is small or zero, there is little or no (linear) relationship
between the values of X and Y (or if a strong relationship exists, it is
nonlinear). Therefore, the Cov(X, Y) is a measure of the degree of (linear)
interrelationship between the variates X and Y. For this purpose, however, it is
preferable to use the normalized covariance or correlation coefficient, which
is defined as

$\rho = \frac{\mathrm{Cov}(X, Y)}{\sigma_X \sigma_Y}$    (3.73)

The values of ρ range between −1 and +1; that is,

$-1 \le \rho \le +1$    (3.74)

which we can verify as follows.


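According to Schwarz's inequality (Hardy, Littlewood, Polya, 1959),

$\left[ \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} (x - \mu_X)(y - \mu_Y) f_{X,Y}(x, y)\,dx\,dy \right]^2 \le \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} (x - \mu_X)^2 f_{X,Y}(x, y)\,dx\,dy \cdot \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} (y - \mu_Y)^2 f_{X,Y}(x, y)\,dx\,dy$

But the left-hand side is [Cov(X, Y)]², whereas

$\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} (x - \mu_X)^2 f_{X,Y}(x, y)\,dx\,dy = \int_{-\infty}^{\infty} (x - \mu_X)^2 f_X(x)\,dx = \sigma_X^2$

and

$\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} (y - \mu_Y)^2 f_{X,Y}(x, y)\,dx\,dy = \int_{-\infty}^{\infty} (y - \mu_Y)^2 f_Y(y)\,dy = \sigma_Y^2$

Hence we have

$[\mathrm{Cov}(X, Y)]^2 \le \sigma_X^2 \sigma_Y^2$    (3.75)

or

$\rho^2 \le 1.0$    (3.75a)

thus verifying Eq. 3.74.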

When ρ = ±1.0, X and Y are linearly related as shown in Figs. 3.15a
and 3.15b, respectively, whereas, when ρ = 0, values of X and Y may appear
as in Fig. 3.15c. For intermediate values of ρ, values of X and Y would
appear as in Fig. 3.15d; the "scatter" decreases as ρ increases. However, we
also observe from Figs. 3.15e and 3.15f that when the relation between X
and Y is nonlinear, ρ may be zero even when there is a perfect functional
relationship between the variables.

Therefore the magnitude of the correlation coefficient ρ (between 0 and
1) is a measure of the degree of linear interrelationship between two
variables.
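The point that ρ measures only linear interrelationship can be made concrete with a small discrete case. The sketch below uses a hypothetical variable X taking the values −2, ..., 2 with equal probability (not an example from the text) and compares Y = X², a perfect but nonlinear relation, with Y = 3X + 1, a perfect linear one. Exact fractions are used so that the zero covariance is not obscured by round-off.

```python
from fractions import Fraction

def second_moments(pmf):
    """Return (Cov(X, Y), Var(X), Var(Y)) for a joint PMF {(x, y): probability}."""
    mean_x = sum(p * x for (x, y), p in pmf.items())
    mean_y = sum(p * y for (x, y), p in pmf.items())
    cov = sum(p * (x - mean_x) * (y - mean_y) for (x, y), p in pmf.items())
    var_x = sum(p * (x - mean_x) ** 2 for (x, y), p in pmf.items())
    var_y = sum(p * (y - mean_y) ** 2 for (x, y), p in pmf.items())
    return cov, var_x, var_y

xs = [-2, -1, 0, 1, 2]                                  # hypothetical values of X
parabola = {(x, x * x): Fraction(1, 5) for x in xs}     # Y = X^2
line = {(x, 3 * x + 1): Fraction(1, 5) for x in xs}     # Y = 3X + 1

cov_p, var_xp, var_yp = second_moments(parabola)
cov_l, var_xl, var_yl = second_moments(line)
rho_squared_line = cov_l ** 2 / (var_xl * var_yl)

print(cov_p)              # covariance for the parabolic relation
print(rho_squared_line)   # squared correlation for the linear relation
```

The parabolic case has zero covariance, and hence ρ = 0, even though Y is completely determined by X; the linear case attains the bound of Eq. 3.75a.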


It is also important to point out that although ρ is a measure of the degree
of (linear) relationship between two variables, this does not necessarily
imply a causal effect between the variables. Two variables X and Y
may both depend on another variable (or variables), in which case there
will be a strong correlation between the values of X and Y, but the values
of one variable may have no direct effect on the values of the other. For
example, the flood flow of a river and the productivity of a construction
crew may be highly correlated because both depend on the weather condition;
however, the flood flow may have no direct influence on the productivity
of the construction crew, or vice versa. Consider also the following
problem from mechanics.

Figure 3.15 Significance of correlation coefficient ρ. (a) ρ = +1.0. (b) ρ = −1.0.
(c) ρ = 0. (d) 0 < ρ < 1.0. (e) ρ = 0. (f) ρ = 0.


EXAMPLE 3.26


A cantilever beam is subjected to two random loads S_1 and S_2 (Fig. E3.26), which
are statistically independent with means and standard deviations μ_1, σ_1 and μ_2, σ_2,
respectively.

Figure E3.26

The shear force Q and bending moment M at the fixed support are

$Q = S_1 + S_2$

and

$M = aS_1 + 2aS_2$

which are also random variables with means and variances as follows (see Section
4.3.2)

$\mu_Q = \mu_1 + \mu_2$;  $\sigma_Q^2 = \sigma_1^2 + \sigma_2^2$
$\mu_M = a\mu_1 + 2a\mu_2$;  $\sigma_M^2 = a^2(\sigma_1^2 + 4\sigma_2^2)$

Although S_1 and S_2 are statistically independent, Q and M will be correlated;
this correlation can be evaluated as follows:

$E(QM) = E[(S_1 + S_2)(aS_1 + 2aS_2)]$
$= aE(S_1^2) + 3aE(S_1 S_2) + 2aE(S_2^2)$

but E(S_1 S_2) = E(S_1)E(S_2) by Eq. 3.71a, and E(S_1²) = σ_1² + μ_1², E(S_2²) = σ_2² +
μ_2²; thus

$E(QM) = a(\sigma_1^2 + \mu_1^2) + 2a(\sigma_2^2 + \mu_2^2) + 3a\mu_1\mu_2$
$= a(\sigma_1^2 + 2\sigma_2^2) + \mu_Q \mu_M$

$\mathrm{Cov}(Q, M) = E(QM) - \mu_Q \mu_M$
$= a(\sigma_1^2 + 2\sigma_2^2)$

and the corresponding correlation coefficient is

$\rho_{QM} = \frac{\mathrm{Cov}(Q, M)}{\sigma_Q \sigma_M} = \frac{\sigma_1^2 + 2\sigma_2^2}{\sqrt{(\sigma_1^2 + \sigma_2^2)(\sigma_1^2 + 4\sigma_2^2)}}$

Hence, if σ_1 = σ_2,

$\rho_{QM} = \frac{3}{\sqrt{10}} = 0.948$

indicating strong correlation between the shear and moment at the support. This
correlation arises because Q and M are functions of the same loads S_1 and S_2;
however, there is no causal relation between Q and M.
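The correlation coefficient of this example depends only on the two standard deviations, which a short function makes explicit. This is a minimal sketch under the example's assumptions (independent loads, Q = S_1 + S_2, M = aS_1 + 2aS_2); the function name `rho_qm` and the default value of the lever arm `a` are illustrative.

```python
import math

def rho_qm(sigma1, sigma2, a=1.0):
    """Correlation coefficient between shear Q and moment M at the support.

    Assumes S1 and S2 are statistically independent, Q = S1 + S2 and
    M = a*S1 + 2a*S2, as in the example.
    """
    cov = a * (sigma1 ** 2 + 2.0 * sigma2 ** 2)
    sd_q = math.sqrt(sigma1 ** 2 + sigma2 ** 2)
    sd_m = a * math.sqrt(sigma1 ** 2 + 4.0 * sigma2 ** 2)
    return cov / (sd_q * sd_m)

equal_case = rho_qm(1.0, 1.0)            # sigma1 = sigma2
scaled_case = rho_qm(2.0, 2.0, a=3.5)    # same ratio, different scale and arm
print(equal_case, scaled_case)
```

For equal standard deviations the result is 3/√10, approximately 0.9487, which the text reports as 0.948; the value does not change with the lever arm a or with a common scaling of σ_1 and σ_2.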


3.3.3. Conditional mean and variance*


If there are two random variables, the mean and variance of one variable
may depend on the value of the other variable; in such cases we have
conditional means and conditional variances. Indeed, it would be meaningful
to speak of conditional moments of any order.

* This section may be skipped over on first reading; the material is not needed for
understanding the remaining chapters of the book.


If X and Y are discrete random variables with joint PMF p_{X,Y}(x, y),
the conditional mean of X, given Y = y, is

$\mu_{X|y} = E(X \mid Y = y) = \sum_{\text{all } x} x\, p_{X|Y}(x \mid y)$    (3.76)

and if X and Y are statistically independent, that is, p_{X|Y}(x | y) = p_X(x),
then

$E(X \mid Y = y) = E(X)$    (3.77)

From Eqs. 3.59 and 3.60, we can write

$E(X) = \sum_{\text{all } x} x\, p_X(x) = \sum_{\text{all } x} x \sum_{\text{all } y} p_{X,Y}(x, y)$
$= \sum_{\text{all } y} \sum_{\text{all } x} x\, p_{X|Y}(x \mid y)\, p_Y(y)$

Thus, substituting Eq. 3.76,

$E(X) = \sum_{\text{all } y} E(X \mid Y = y)\, p_Y(y)$    (3.78)


If X and Y are continuous random variables, the conditional mean of X,
given Y = y, becomes

$E(X \mid Y = y) = \int_{-\infty}^{\infty} x\, f_{X|Y}(x \mid y)\,dx$    (3.76a)

and the relationship in Eq. 3.78 becomes

$\mu_X = \int_{-\infty}^{\infty} \mu_{X|y}\, f_Y(y)\,dy$    (3.78a)


We should emphasize that whereas E(X | Y = y) is a constant,
E(X | Y) is a random variable whose mean is

$E_Y[E(X \mid Y)] = \sum_{\text{all } y} E(X \mid Y = y)\, p_Y(y)$
$= \sum_{\text{all } y} \sum_{\text{all } x} x\, p_{X|Y}(x \mid y)\, p_Y(y)$
$= \sum_{\text{all } x} x\, p_X(x) = E(X)$    (3.79)

The subscript Y on E emphasizes that the expectation is with respect to Y.
The conditional variance of X, given Y = y, is

$\mathrm{Var}(X \mid Y = y) = E[(X - \mu_{X|y})^2 \mid Y = y]$    (3.80)

Thus, for discrete X and Y,

$\mathrm{Var}(X \mid Y = y) = \sum_{\text{all } x} (x - \mu_{X|y})^2\, p_{X|Y}(x \mid y)$    (3.80a)

and for continuous X and Y,

$\mathrm{Var}(X \mid Y = y) = \int_{-\infty}^{\infty} (x - \mu_{X|y})^2\, f_{X|Y}(x \mid y)\,dx$    (3.80b)


The total (unconditional) variance can be expanded as follows:

$\mathrm{Var}(X) = E[(X - \mu_X)^2] = E_Y\{E[(X - \mu_X)^2 \mid Y]\}$

The last equality follows from Eq. 3.79. This last term, however, is

$E_Y\{E[(X - \mu_X)^2 \mid Y]\} = E_Y\{E[(X - \mu_{X|Y} + \mu_{X|Y} - \mu_X)^2 \mid Y]\}$
$= E_Y\{E[(X - \mu_{X|Y})^2 \mid Y]$
$\quad + 2E[(X - \mu_{X|Y})(\mu_{X|Y} - \mu_X) \mid Y]$
$\quad + E[(\mu_{X|Y} - \mu_X)^2 \mid Y]\}$

Recognizing that the second term is zero, and E_Y(μ_{X|Y}) = μ_X according to
Eq. 3.79, we have

$\mathrm{Var}(X) = E_Y[\mathrm{Var}(X \mid Y)] + \mathrm{Var}_Y[E(X \mid Y)]$    (3.81)

Equation 3.81 says that the total variance is equal to the mean value of the
conditional variance plus the variance of the conditional mean.


3.4. CONCLUDING REMARKS


The principal concepts introduced in this chapter include the notions of a
random variable and its associated probability distribution. Several of the
more useful probability distribution functions and their properties are also
described and developed. However, the list of distributions is incomplete;
a number of other important distributions were omitted, including the
several extreme-value distributions that will be presented in Vol. II.

The complete description of a random variable would be accomplished
by specifying its probability distribution (including the values of its parameters).
However, a random variable may also be described approximately
with its mean value and variance (or standard deviation); physically,
these main descriptors of a random variable represent its central value and
measure of dispersion. For two (or more) random variables, the main descriptors
must include also the covariance or correlation coefficient between
the variables.

Thus far (and this will continue through Chapter 4), we have been
dealing with idealized theoretical models. In particular, we have assumed,
tacitly at least, that the probability distribution of a random variable,
or its main descriptors, are known. In a real problem, of course, these must
be estimated and inferred or derived on the basis of real-world data and
conditions. The concepts and methods for these purposes are the subjects
of Chapters 5 to 8.


PROBLEMS


Section 3.1


3.1 A contractor is submitting bids to 3 jobs, A, B, and C. The probabilities
that he will win each of the three jobs are P(A) = 0.5, P(B) = 0.8, and P(C) =
0.2, respectively. Assume events A, B, C are statistically independent.
Let X be the total number of jobs the contractor will win.
(a) What are the possible values of X? Compute and plot the probability
mass function (PMF) of the random variable X.
(b) Plot the distribution function of X.
(c) Determine P(X ≤ 2). Ans. 0.92.
(d) Determine P(0 < X ≤ 2). Ans. 0.84.
3.2 The settlement of a structure has the probability density function shown in
Fig. P3.2.
(a) What is the probability that the settlement is less than 2 cm?
(b) What is the probability that the settlement is between 2 and 4 cm?
(c) If the settlement is observed to be more than 2 cm, what is the probability
that it will be less than 4 cm?
3.3 The bearing capacity of the soil under a column-footing foundation is known
to vary between 6 and 15 kips/sq ft. Its probability density within this range
is given as


fx) -z -9) 6 «x «15 
' =0 elsewhere ` 


If the column is designed to carry a load of 7.5 kips/sq ft, what is the
probability of failure of the foundation?


Figure P3.2 (horizontal axis: x, settlement in cm)

Figure P3.4


3.4 The time duration of a force acting on a structure has been found to be a
random variable having the density function shown in Fig. P3.4.
(a) Determine the appropriate values of a and b for the density function.
(b) Calculate the mean and median for the variable T.
(c) Calculate the probability that T will be equal to or greater than 6 sec,
that is, P(T ≥ 6).
3.5 A construction project consisted of building a major bridge across a river
and a road linking it to a city (Fig. P3.5a). The contractual time for the entire
project is 15 months.
The contractor knows that the construction of the road will require between
12 and 18 months, and the bridge could take between 10 and 20 months.
The probability density functions of the respective completion times, however,
are uniform for the road, and triangular for the bridge, as shown in
Figs. P3.5b and c. Construction of the road and bridge can proceed simultaneously,
and the completion of the bridge and the road are statistically independent.
Determine the probability of completing the project within the contractual
time.

Figure P3.5a

Figure P3.5b

Figure P3.5c


3.6 In order to repair the cracks that may exist in a 10-ft weld, a nondestructive
testing (NDT) device is used first to detect the location of cracks. Because
cracks may exist in various shapes and sizes, the probability that a crack
will be detected by the NDT device is only 0.8. Assume that the events of
each crack being detected are statistically independent.
(a) If there are two cracks in the weld, what is the probability that they
would not be detected?
(b) The actual number of cracks N in the weld is not known. However, its
PMF is given as in Fig. P3.6. What is the probability that the NDT
device will fail to detect any crack in this weld?
(c) Determine the mean, variance, and coefficient of variation of N based
on the PMF given in Fig. P3.6.
(d) If the device fails to detect any crack in the weld, what is the probability
that the weld is flawless (that is, no crack at all)?

Figure P3.6 (horizontal axis: number of cracks)

Figure P3.8 PDF of waiting time (horizontal axis: t, waiting time, hours)


3.7 Suppose the duration (in months) of a construction job can be modeled as a
continuous random variable T whose cumulative distribution function (CDF)
is given by

$F_T(t) = t^2 - 2t + 1$,  1 ≤ t ≤ 2
$= 0$,  t < 1
$= 1$,  t > 2

(a) Determine the corresponding density function f_T(t).
(b) Compute P(T > 1.5).
3.8 The waiting time at airport A of city B has a density function shown in Fig.
P3.8. The waiting time is measured from the time a traveler enters the terminal
to the time when he is airborne.
The travel time from hotel C to the airport depends on the transportation
mode and may be assumed to be 0.75, 1.00, and 1.25 hours corresponding to
travel by rapid transit, taxi, and limousine, respectively. The probability
of a traveler's taking each mode of transportation is as follows:

P(rapid transit) = 0.3
P(taxi)          = 0.4
P(limousine)     = 0.3

(a) What is the probability that a traveler will be airborne in at most 3 hr
after leaving hotel C? Ans. 0.436.
(b) Given that the traveler is airborne within 3 hr, what is the probability
that he took the limousine? Ans. 0.234.


3.9 Two reservoirs are located upstream of a town; the water is held back by two
dams A and B. Dam B is 40 m high. (See Fig. P3.9a.) During a strong-motion
earthquake, dam A will suffer damage and water will flow downstream
into the lower reservoir. Depending on the amount of water in the upper
reservoir when such an earthquake occurs, the lower reservoir water may or
may not overflow dam B. Suppose that the water level at reservoir B, during
an earthquake, is either 25 m or 35 m, as shown in Fig. P3.9b; and the
increase in the elevation of water level in B caused by the additional water
from reservoir A is a continuous random variable with the probability
density function given in Fig. P3.9c.
(a) Determine the value of a in Fig. P3.9c.
(b) What is the probability of overflow at B during a strong-motion
earthquake?
(c) If there were no overflow at B during an earthquake, what is the
probability that the original water level in reservoir B is 25 m?

Figure P3.9a

Figure P3.9b Water level in reservoir B (horizontal axis: y, in m)

Figure P3.9c Increase in water level in reservoir B


3.10 A stretch of an intercity freeway has 3 one-way lanes and 2 convertible
lanes. The capacity of the highway when the 3 lanes are used is 100 cars per
minute. Its capacity when 5 lanes are used is 140 cars per minute.
Three lanes of the freeway are used when there is normal traffic, whereas all
five lanes will be used whenever there is heavy traffic volume. The density
functions of the traffic volumes in each case are shown in Figs. P3.10a and b.
On a given day, if normal traffic is twice as likely as heavy traffic, what is
the probability that the capacity of the freeway will be surpassed?

Figure P3.10 PDF of traffic volume. (a) Normal traffic. (b) Heavy traffic (horizontal axes: cars per min)


3.11 A traveler going from city A to city C must pass through city B (Fig. P3.11a).
The quantities T_1 and T_2 are the times of travel from city A to city B and
from city B to city C, in hours, respectively, which are statistically independent
random variables. The probability mass functions of T_1 and T_2 are as shown in
Figs. P3.11b and c. The time required to go through city B may be considered
a deterministic quantity equal to 1 hr.
(a) Calculate the mean, the variance, the standard deviation, and the
coefficient of variation of T_1.
(b) Determine the PMF of the total time of travel from city A to city C.
Sketch your results graphically.
(c) What is the probability that the travel time from city A to city C will
be at least 8 hr?

Figure P3.11a

Figure P3.11b (horizontal axis: t_1, hours)

Figure P3.11c (horizontal axis: t_2, hours)


3.12 The hourly volume of traffic for a proposed highway is distributed as in
Fig. P3.12.
(a) The traffic engineer may design the highway capacity equal to the
following.
(i) The mode of X.
(ii) The mean of X.
(iii) The median of X.
(iv) x_{.90}, the 90-percentile value, which is defined as F_X(x_{.90}) = 0.90.
Determine the design capacity of the highway and the corresponding
probability of exceedance (that is, capacity is less than traffic volume)
for each of the four cases.
(b) Assume that the actual capacity of the highway after it is built is either
300 or 350 vehicles per hr with relative likelihoods of 1 to 4. What is the
probability that the capacity will be exceeded?

Figure P3.12 PDF of hourly traffic volume (horizontal axis: x, number of vehicles)


3.13 The lateral resistance of a small building frame is random with the density
function

$f_R(r) = \frac{3}{500}(r - 10)(20 - r)$,  10 ≤ r ≤ 20
$= 0$,  elsewhere

(a) Plot the density function f_R(r) and the cumulative distribution function
F_R(r).
(b) Determine:
(i) Mean value of R.
(ii) Median of R.
(iii) Mode of R.
(iv) Standard deviation of R. Ans. √5.
(v) Coefficient of variation of R. Ans. 0.149.
(vi) Skewness coefficient. Ans. 0.


3.14 The delay time of a construction project is described with a random variable X.
Suppose that X is a discrete variate with probability mass function given in
Table P3.14a. The penalty for late completion of the project depends on the
number of days of delay; that is, penalty = g(x). The penalty function is
given in Table P3.14b in units of $100,000.

Table P3.14a. PMF of X

x (in days)    p_X(x)
1              0.5
2              0.3
3              0.1
4              0.1

Table P3.14b. Penalty function

x (days)    g(x) ($100,000)
1           5
2           6
3           7
4           7

(a) Calculate the mean penalty for this project. Ans. $570,000.
(b) Calculate the standard deviation of the penalty. Ans. $78,000.


Section 3.2


3.15 If the annual precipitation X in a city is a normal variate with a mean of 50 in.
and a coefficient of variation of 0.2, determine the following.
(a) The standard deviation of X.
(b) P(X < 30).
(c) P(X > 60).
(d) P(40 < X ≤ 55).
(e) Probability that X is within 5 in. from the mean annual precipitation.
(f) The value x_0 such that the probability of the annual precipitation
exceeding x_0 is only 1/4 that of not exceeding x_0.


3.16 The present air traffic volume at an airport (number of landings and takeoffs)
during the peak hour is a normal variate with a mean of 200 and a standard
deviation of 60 airplanes (Fig. P3.16).
(a) If the present runway capacity (for landings and takeoffs) is 350 planes
per hr, what is currently the daily probability of air traffic congestion?
Assume there is one peak hour daily. Ans. 0.0062.
(b) If no additional airports or expansion is built, what would be the
probability of congestion 10 years hence? Assume that the mean traffic
volume is increasing linearly at 10% of current volume per year, and the
coefficient of variation remains the same. Ans. 0.662.
(c) If the projected growth is correct, what airport capacity will be required
10 years from now to maintain the present service condition (that is,
the same probability of congestion as now)? Ans. 700.

Figure P3.16 Peak air traffic versus time in years, showing the mean growth curve and the present distribution N(200, 60)


3.17 The moment capacity M for the cantilever beam shown in Fig. P3.17 is
constant throughout the entire span. Because of uncertainties in material
strength, M is assumed to be Gaussian with mean 50 kip-ft and coefficient
of variation 20%. Failure occurs if the moment capacity is exceeded anywhere
along the beam.
(a) If only a concentrated load of 3 kips is applied at the free end, what is the
probability that the beam will fail? Ans. 0.023.
(b) If only a uniform load of 0.5 kips/ft is applied on the entire beam, what
is the probability that the beam will fail? Ans. 0.006.
(c) In rare cases, the beam may be subjected to the combination of the
concentrated load and the uniform load; what will be the reliability
(probability of no failure) of the beam when this case occurs? Ans. 0.308.
(d) Suppose that the beam had survived under the concentrated load. What
will be the probability that it will survive under the combined loads?
Ans. 0.316.
(e) Suppose that a reliability level of 99.5% is desired, and the beam is
subjected only to the uniform load w across the span. What will be the
maximum allowable w? Ans. 0.484 kip/ft.

Figure P3.17

Figure P3.18


3.18 A portion of an activity network is shown in Fig. P3.18; an arrow indicates
the starting and ending of an activity. Activity C can start only after completion
of both activities A and B, whereas activity D can start only after completion
of C. A, B, C, D are statistically independent activities.

The scheduled starting dates are as follows, and an activity cannot start
earlier than its scheduled date. (For simplicity, assume all months have
30 days.)

Activities A & B: May 1
Activity C:       June 1
Activity D:       August 1

The times required to complete each activity are Gaussian random variables
as follows.

Activity A: N(25 days, 5 days)
Activity B: N(26 days, 4 days)
Activity C: N(48 days, 12 days)
Activity D: N(40 days, 8 days)

Assume that both activities A and B started on schedule, that is, on May 1.
(a) Determine the probability that activity C will not start on schedule.
Ans. 0.292.
(b) The availability of labor is such that unless C is started on schedule
the necessary work force will be diverted to another project and thus
will be unavailable for this activity for at least 90 days. What is the
probability that activity D will start on schedule? Ans. 0.596.


3.19 A contractor estimates that the expected time for the completion of job A
is 30 days. Because of uncertainties that exist in the labor market, materials
supply, bad weather conditions, and so on, he is not sure that he will finish
the job in exactly 30 days. However, he is 90% confident that the job will
be completed within 40 days. Let X denote the number of days required to
complete job A.

(a) Assume X to be a Gaussian random variable; determine μ and σ
and also the probability that X will be less than 50, based on the given
information. Ans. 0.9948.

(b) Recall that a Gaussian random variable ranges from −∞ to ∞.
Thus X may take on negative values that are physically impossible.
Determine the probability of such an occurrence. Based on this result,
is the assumption of the normal distribution for X reasonable? Ans.
0.00006.

(c) Let us now assume that X has a log-normal distribution with the same
expected value and variance as those in the normal distribution of
part (a). Determine the parameters λ and ζ, and also the probability
that X will be less than 50. Compare this with the result of part (a).
Ans. 0.9817.


3.20 From records of repairs of construction equipment, it is found that the
failure-free operation time (that is, time between breakdowns) of a piece of
equipment may be modeled with a log-normal variate, with a mean of 6 months
and a standard deviation of 1.5 months. As the engineer in charge of maintaining
the operational condition of a fleet of construction equipment, you
wish to have at least a 90% probability that a piece of equipment will be
operational at any time.

(a) How often should each piece of equipment be scheduled for maintenance?
Ans. 4.22 months.

(b) If a particular piece of equipment is still in good operating condition
at the time it is scheduled for maintenance, what is the probability that
it can operate for at least another month without its regular maintenance?
Ans. 0.749.


3.21 A system of storm sewers is proposed for a city. In order to evaluate the
effectiveness of the sewer system in preventing flooding of the streets, the
following information has been gathered. Figure P3.21a shows the probability
mass function for the number of occurrences of rainstorms each year in the
city. Figure P3.21b shows the distribution of the maximum runoff rate in each
storm, which is log-normal with a median of 7 cfs (cubic feet/sec) and COV
of 15%. From hydraulic analysis, the proposed sewer system is shown to be
adequate for any storm with runoff rate less than 8 cfs. Assume that the
maximum runoff rates between storms are statistically independent.
(a) What is the mean and variance of the number of rainstorms in a year
for the city?
(b) What is the probability of flooding during a rainstorm? Ans. 0.187.
(c) What is the probability of flooding in a year? Ans. 0.189.

Figure P3.21a (horizontal axis: number of rainstorms in a year)

Figure P3.21b Log-normal, median = 7, c.o.v. = 15% (horizontal axis: maximum runoff rate, cfs)


3.22 The depth to which a pile can be driven without hitting the rock stratum is
denoted as H (Fig. P3.22a). For a certain construction site, suppose that this
depth has a log-normal distribution (Fig. P3.22b) with mean of 30 ft and
COV of 20%. In order to provide satisfactory support, a pile should be
embedded 1 ft into the rock stratum.

(a) What is the probability that a pile of length 40 ft will not anchor
satisfactorily in rock? Ans. 0.10.

(b) Suppose a 40-ft pile has been driven 39 ft into the ground and rock has
not yet been encountered. What is the probability that an additional
5 ft of pile welded to the original length will be adequate to anchor this
pile satisfactorily in rock? Ans. 0.71.

Figure P3.22a (ground surface and rock stratum)

Figure P3.22b Log-normal (horizontal axis: h, depth in ft)


3.23 A water distribution subsystem consists of pipes AB, BC, and AC as shown
in Fig. P3.23. Because of differences in elevation and in hydraulic head loss
in the pipes and associated uncertainties, the capacity of each pipe (which
is defined as the maximum rate of flow) is given as follows, in cfs (cubic
feet/sec):

AB: capacity is Gaussian with mean 5, COV 10%

BC: capacity is log-normal with median 5, COV 10%

AC: capacity equal to 8 or 9 with equal likelihood

(a) Determine the probability that the capacity of the branch ABC will
exceed 4 cfs. Ans. 0.963.

[Figure P3.23: pipes AB, BC, and AC of the water distribution subsystem.]

(b) Determine the probability that the total capacity of the subsystem shown
above will exceed 13 cfs. Ans. 0.607. (Hint. Use conditional probability.)


3.24 A construction project is at present 30 days away from the scheduled
completion date. Depending on the weather condition in the next month, the time
required for the remaining construction will have log-normal distributions as
follows:

Weather    Time required (days)
Good       Mean = 25, σ = 4
Bad        Median = 30, σ = 6

Based on preliminary investigation, the weather in the next month would be
equally likely to be good or bad.

(a) What is the probability that there will be a delay in the completion of
the project? Ans. 0.306.

(b) A weather specialist is hired to obtain additional information on the
weather condition for the next month. However, the specialist is not
perfect in his prediction. In general, his predictions are correct 90%
of the time, that is, P(PG | G) = 0.9 and P(PB | B) = 0.9, where PG,
PB denote the event that he predicts good and bad weather, respectively,
and G, B denote the event that the weather is actually good and bad,
respectively. Suppose that the specialist predicted good weather for the
next month. What is the updated probability that there will be a delay
in the completion of the project? Ans. 0.150.
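
The updating in part (b) combines Bayes' theorem with the theorem of total
probability. The sketch below is one hedged way to check it: it takes the
stated answer to part (a), 0.306, as an input, and assumes P(delay | bad) = 0.5
because the 30 days available equal the median duration in bad weather. The
function name and argument names are illustrative only.

```python
def updated_delay_probability(p_delay_prior=0.306, p_delay_bad=0.5,
                              prior_good=0.5, accuracy=0.9):
    """Return P(delay | specialist predicts good weather)."""
    prior_bad = 1.0 - prior_good
    # Back out P(delay | good) from the total-probability result of part (a):
    # p_delay_prior = prior_good * P(D|G) + prior_bad * P(D|B).
    p_delay_good = (p_delay_prior - prior_bad * p_delay_bad) / prior_good
    # Bayes' theorem: P(G | PG) = P(PG|G) P(G) / P(PG).
    p_predict_good = accuracy * prior_good + (1.0 - accuracy) * prior_bad
    post_good = accuracy * prior_good / p_predict_good
    post_bad = 1.0 - post_good
    # Total probability again, now with the updated weather probabilities.
    return post_good * p_delay_good + post_bad * p_delay_bad


p_updated = updated_delay_probability()
print(round(p_updated, 3))
```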


3.25 A compacted subgrade is required to have a specified density of 110 pcf
(pounds per cu ft). It will be acceptable if 4 out of 5 cored samples have at
least the specified density.

(a) Assuming each sample has a probability of 0.80 of meeting the required
density, what is the probability that the subgrade will be acceptable?
Ans. 0.737.
(b) What should the probability of each sample be in order to achieve an 80%
probability of an acceptable subgrade?
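
A minimal sketch of the binomial calculation, assuming "4 out of 5" means at
least 4 of the 5 samples and that the samples are statistically independent.
The helper names and the use of bisection for part (b) are illustrative
choices, not a prescribed method.

```python
from math import comb


def acceptance_probability(p, n=5, k_min=4):
    """P(at least k_min of n independent samples meet the density)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k)
               for k in range(k_min, n + 1))


def required_sample_probability(target, tol=1e-10):
    """Find p such that acceptance_probability(p) equals target (bisection)."""
    lo, hi = 0.0, 1.0
    # acceptance_probability increases with p, so bisection applies.
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if acceptance_probability(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


p_accept = acceptance_probability(0.80)      # part (a)
p_required = required_sample_probability(0.80)  # part (b)
print(round(p_accept, 3), round(p_required, 3))
```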

3.26 The following is the 20-year record of the annual maximum wind velocity V
in town A (in kilometers per hour, kph).


Year    V (kph)    Year    V (kph)

1950    78.2       1960    78.4
1951    15.8       1961    76.4
1952    81.8       1962    3729
1953    85.2       1963    16.0
1954    15.9       1964    79.3
1955    18.2       1965    7714
1956    72.3       1966    TA
1957    69.3       1967    80.8
1958    16.1       1968    70.6
1959    74.8       1969    73.5




(a) Based on this record, estimate the probability that V will exceed 80 kph
in any given year.

(b) What is the probability that in the next 10 years there will be exactly 3
years with annual maximum wind velocity exceeding 80 kph?

(c) If a temporary structure is designed to resist a maximum wind velocity
of 80 kph, what is the probability that this design wind velocity will
be exceeded during the structure's lifetime of 3 years?

(d) How would the answer in part (c) change if the design wind velocity is
increased to 85 kph?


3.27 The sewers in a city are designed for a rainfall having a return period of 10
years.
(a) What is the probability that the sewers will be flooded for the first time
in the third year after completion of construction?
(b) What is the probability of flooding in the first 3 years?
(c) What is the probability of flooding in 3 of the first 5 years?
(d) What is the probability of only one flood within 3 years?
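
One way to organize these four parts is as a Bernoulli sequence with annual
exceedance probability p = 1/T. The sketch below assumes that years are
statistically independent and that "flooding in 3 of the first 5 years" means
exactly 3; both readings, and the variable names, are assumptions.

```python
from math import comb

p = 1 / 10  # annual probability of exceeding the 10-year design rainfall

# (a) Geometric: no flooding in years 1 and 2, flooding in year 3.
first_in_year_3 = (1 - p)**2 * p
# (b) Complement of no flooding in 3 years.
at_least_once_in_3 = 1 - (1 - p)**3
# (c) Binomial: exactly 3 flood years out of 5.
exactly_3_of_5 = comb(5, 3) * p**3 * (1 - p)**2
# (d) Binomial: exactly 1 flood year out of 3.
exactly_1_of_3 = comb(3, 1) * p * (1 - p)**2

print(first_in_year_3, at_least_once_in_3, exactly_3_of_5, exactly_1_of_3)
```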


3.28 A preliminary planning study on the design of a bridge over a river
recommended a permissible probability of 30% of the bridge being inundated by
flood in the next 25 years.
(a) If p denotes the probability that the design flood level for the bridge
will be exceeded in 1 year, what should the value of p be to satisfy the
design criterion given above? [Hint. For small values of x,
$(1 - x)^n \approx 1 - nx$.]
(b) What is the return period of this design flood? Ans. 83.4 years.
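
The hint's linearization can be compared with the exact solution of
1 − (1 − p)^n = risk. The sketch below takes the permissible probability as
0.30 over 25 years; the variable names are illustrative.

```python
n_years = 25
risk = 0.30  # permissible probability of inundation in n_years

# Hint: (1 - p)**n is roughly 1 - n*p for small p, so risk is roughly n*p.
p_approx = risk / n_years
# Exact: solve 1 - (1 - p)**n = risk for p.
p_exact = 1 - (1 - risk)**(1 / n_years)

# Return period is the reciprocal of the annual exceedance probability.
T_approx = 1 / p_approx
T_exact = 1 / p_exact
print(round(T_approx, 1), round(T_exact, 1))
```

The approximate return period is about 83 years, in line with the stated
answer; the exact relation gives a noticeably shorter return period, which
suggests the stated answer relies on the hint.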


3.29 Figure P3.29 shows a 40-ft soil stratum where boulders are randomly
deposited. Piles are designed to be driven to rock. For simplicity, assume that
the stratum can be divided into 4 independent layers of 10 ft each, that the
probability of hitting a boulder within each 10-ft layer is 0.02, and that the
probability of hitting 2 or more boulders within each layer is negligible.

(a) What is the probability that a pile will be successfully driven to rock
without hitting any boulder?

(b) What is the probability that it will hit at most 1 boulder on its way to
rock?

(c) What is the probability that a pile will hit the first boulder in layer C?

(d) Suppose the foundation of a small building requires a group of 4 such
piles driven to rock. What is the probability that no boulders will be
encountered in driving the piles? Assume that the pile-driving conditions
between piles are statistically independent.

[Figure P3.29: soil stratum with boulders, divided into layers A, B, C, and D
of 10 ft each, overlying rock.]


3.30 The useful life per mile of pavement (Fig. P3.30) is described as a log-normal
variate with a median of 3 years and COV of 50%. Life means the usable
time until repair is required. Assume that the lives between any 2 miles of
pavement are statistically independent.
(a) What is the probability that a mile of pavement will require repair in a
year?
(b) Suppose that the design life is specified to be the 5-percentile life
$x_{0.05}$ (that is, the pavement life will be less than the design life with
probability 5%). Determine the design life.
(c) What is the probability that there will be no repairs required in the
first year of a 4-mile stretch of pavement?
(d) What is the probability that 2 of the 4 miles will need repairs in the
first year?
(e) What is the probability of repairs of the 4-mile stretch in the first 3
years of use?
(f) What is the probability that the first repair of the 4-mile stretch will
occur in the second year? (Note that the condition in the second year
is not independent of the first year.) Ans. 0.543.


[Figure P3.30: log-normal density of life per mile, years (median = 3 years,
c.o.v. = 0.50). Figure P3.31: density of H.]


3.31 The maximum annual flood level of a river is denoted by H (in meters).
Assume that the probability density of H is described by the triangular
distribution shown in Fig. P3.31.

(a) Determine the flood height $h_{20}$ which has a mean recurrence interval
(return period) of 20 years.

(b) What is the probability that during the next 20 years the river height H
will exceed $h_{20}$ at least once?

(c) What is the probability that during the next 5 years the value of $h_{20}$
will be exceeded exactly once?

(d) What is the probability that $h_{20}$ will be exceeded at most twice during
the next 5 years?


3.32 For the river in Problem 3.31, a control dam will be constructed according to
the following specification. The height of the dam will be so selected that in the
next 3 years this height will be safe against floods with a probability of 94%.

(a) Determine the required return period of the design flood. Ans. 50 years.
(b) Determine the design height that will meet this requirement. Ans.
68 m.


3.33 For quality control purposes, 3 specimens in the form of 6-in.-diameter
cylinders are taken at random from a batch of concrete, and each specimen
is tested for its compressive strength. A specimen will pass the strength test if
it survives an axial compressive load of 11 kips. From previous record,
the contractor concludes that the histogram of crushing strength of similar
concrete specimens can be satisfactorily modeled by a normal distribution
with mean 14.68 kips and standard deviation 2.1 kips, that is, N(14.68, 2.1).

(a) What is the probability that a specimen picked at random will pass the
test?

(b) If the specification requires all 3 specimens to pass the test for the
batch of concrete to be acceptable, what is the probability that a batch
of concrete prepared by this contractor will be rejected?

(c) The contractor prepares a batch of concrete each day. What is the
probability that at most one batch of concrete will be rejected for a
2-day period?

(d) Repeat part (b), if the specification is relaxed so that one failure out of
the 3 specimens tested is allowed.

(e) The contractor may use a better grade of concrete mix, and together
with better workmanship and supervision, he can improve the mean
crushing strength of concrete specimen to 16.5 kips, while reducing the
coefficient of variation to 90% of its previous value. What is the
probability for a batch to be acceptable now? Assume that the crushing
strength of the concrete is a normal variate, and no failures are allowed
in the 3 specimens tested. Ans. 0.986.
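
A hedged numerical sketch of parts (b) and (e), assuming the 3 specimens are
statistically independent. The function name `batch_acceptance` is
illustrative.

```python
from statistics import NormalDist


def batch_acceptance(mean, sd, load=11.0, n=3):
    """P(all n specimens survive the test load), specimens independent."""
    p_pass = 1.0 - NormalDist(mean, sd).cdf(load)
    return p_pass**n


# Original mix: N(14.68, 2.1) kips.
p_before = batch_acceptance(14.68, 2.1)

# Improved mix: mean 16.5 kips, c.o.v. reduced to 90% of its previous value.
cov_old = 2.1 / 14.68
sd_new = 0.9 * cov_old * 16.5
p_now = batch_acceptance(16.5, sd_new)

print(round(1 - p_before, 3), round(p_now, 3))
```

Here `1 - p_before` is the rejection probability asked for in part (b), and
`p_now` is the acceptance probability of part (e).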

3.34 Three flood control dikes are built to prevent flooding of the low plain as
shown in Fig. P3.34. The dikes are designed as follows.

(i) Design flood of Dike I is the 20-year flood of river A.

(ii) Design flood of Dike II is the 10-year flood of river A.

(iii) Design flood of Dike III is the 25-year flood of river B.

Assume that the floods in rivers A and B are statistically independent; also,
the failures of dikes I and II are statistically independent.

[Figure P3.34: low plain and rivers A and B with the flood control dikes.
Figure P3.35: county bounded by streams A and B.]

(a) Within a year, determine the probability of flooding of the low plain
caused by river A only. Ans. 0.145.

(b) What is the probability of flooding of the low plain area in a year?
Ans. 0.179.

(c) What is the probability of no flooding of the low plain in 4 consecutive
years? Ans. 0.454.
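
The three stated answers can be reproduced with a short complement-rule
sketch. It assumes that Dikes I and II both front river A and Dike III fronts
river B (the reading consistent with the stated answers), that each dike fails
in a year with probability 1/T, and that years are independent. Variable names
are illustrative.

```python
# Annual failure probabilities from the design return periods.
p_dike_1 = 1 / 20   # Dike I, river A
p_dike_2 = 1 / 10   # Dike II, river A
p_dike_3 = 1 / 25   # Dike III, river B

# (a) Flooding from the river A side: at least one of Dikes I, II fails.
p_flood_A = 1 - (1 - p_dike_1) * (1 - p_dike_2)
# (b) Flooding of the low plain: at least one of the three dikes fails.
p_flood_any = 1 - (1 - p_dike_1) * (1 - p_dike_2) * (1 - p_dike_3)
# (c) No flooding in 4 consecutive, independent years.
p_dry_4_years = (1 - p_flood_any)**4

print(round(p_flood_A, 3), round(p_flood_any, 3), round(p_dry_4_years, 3))
```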


3.35 A county is bounded by streams A and B (Fig. P3.35). From flow record, the
annual maximum flow in A may be modeled by a normal distribution with
mean 1000 cfs and COV 20%, whereas that in B may be modeled by a
log-normal distribution with mean 800 cfs and COV 20%. The capacities
(defined as the maximum flow that can be carried without overflowing) of
A and B are 1200 and 1000 cfs, respectively. Assume the stream flows in A
and B are statistically independent.
(a) What is the probability that stream A will overflow in a year?
(b) What is the probability that stream B will overflow in a year?
(c) What is the probability that the county will be flooded in a year?
(d) What is the probability that the county will be free of floods in the next
years?
(e) If it is decided to reduce the probability of overflow in stream A to 5%
a year by enlarging the stream bed at critical locations, what should be
the new capacity of A?
(f) Suppose that, because of error in prediction, the capacity of stream
B may not be 1000 cfs, and there is a 20% chance that the capacity may
be 1100 cfs. In such a case, what is the probability that stream B will
overflow in a year?


3.36 A cofferdam is to be built around a proposed bridge pier location so that
construction of the pier may be carried out “dry” (see Fig. P3.36).

The height of the cofferdam should protect the site from overflow of wave
water during the construction period with a reliability of 95%. The
distribution of the monthly maximum wave height is Gaussian N(5, 2) ft above
mean sea level.

(a) If the construction will take 4 months, what should be the design height
of the cofferdam (above mean sea level)? Assume that monthly maximum
wave heights are statistically independent. Ans. 9.46 ft.

(b) If the time of construction can be shortened by 1 month with an
additional cost of $600, and the cost of constructing the cofferdam is
$2000 per ft (above mean sea level), should the contractor take this
alternative? Assume that the same risk of overflow of wave water still
applies.
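
A minimal sketch of the design-height calculation, assuming the 95%
reliability applies to the whole construction period so that each independent
month must be safe with probability 0.95^(1/n). The function name
`design_height` and the cost comparison for part (b) are one possible framing,
not the only one.

```python
from statistics import NormalDist


def design_height(months, reliability=0.95, mean=5.0, sd=2.0):
    """Height (ft above mean sea level) not exceeded in any of the months."""
    # Independent months: P(no overflow) = P(monthly max < h) ** months.
    monthly_reliability = reliability ** (1.0 / months)
    return NormalDist(mean, sd).inv_cdf(monthly_reliability)


h_4_months = design_height(4)   # part (a)
h_3_months = design_height(3)   # shortened schedule of part (b)

# Saving in cofferdam cost at $2000 per ft, to compare with the $600 premium.
saving = 2000 * (h_4_months - h_3_months)
print(round(h_4_months, 2), round(h_3_months, 2), round(saving))
```

Under these assumptions the saving in cofferdam height is worth less than the
$600 cost of shortening the schedule.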


[Figure P3.36: bridge pier enclosed by the cofferdam, with mean sea level
indicated.]




3.37 A contractor owns 5 trucks for use in his construction jobs. He decides to
institute a new program of truck replacement, using the following procedure:
(i) Any truck that has had more than 1 major breakdown on the job
within a year will be evaluated to determine how many miles it gets per
gallon of gas.
(ii) Any truck given this special evaluation will be replaced if it gets less than
9 miles per gallon.
From prior experience, the contractor knows two facts with a high degree
of confidence: (i) for each truck, the mean rate of major breakdowns is once
every 0.8 year; and (ii) the gasoline consumption of trucks that have more
than 1 major breakdown is a normal variate N(10, 2.5) in miles per gallon.
(a) What is the probability that a given truck will have more than 1
breakdown within a year?
(b) What is the probability that a truck getting a special evaluation will
fail to meet the miles-per-gallon test [see part (ii) above]?
(c) What is the probability that a given truck will be replaced within a
year?
(d) What is the probability that the contractor will replace exactly 1 truck
within a year?

3.38 On the average 2 damaging earthquakes occur in a certain country every
5 years. Assume the occurrence of earthquakes is a Poisson process in time.
For this country, complete the following.
(a) Determine the probability of getting 1 damaging earthquake in 3 years.
(b) Determine the probability of no earthquakes in 3 years.
(c) What is the probability of having at most 2 earthquakes in one year?
(d) What is the probability of having at least 1 earthquake in 5 years?


3.39 (a) The occurrences of flood may be modeled by a Poisson process. If the
mean occurrence rate of floods for a certain region A is once every 8
years, determine the probability of no floods in a 10-year period; of 1
flood; of more than 3 floods.

(b) A structure is located in region A. The probability that it will be
inundated, when a flood occurs, is 0.05. Compute the probability that the
structure will survive if there are no floods; if there is 1 flood; if there
are n floods. Assume statistical independence between floods.

(c) Determine the probability that the structure will survive over the 10-year
period. Ans. 0.939.
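
Part (c) follows from the theorem of total probability over the number of
floods. The sketch below sums the series numerically and compares it with the
closed form exp(−νt·p) that the series reduces to; the helper name
`poisson_pmf` and the truncation at 60 terms are illustrative choices.

```python
import math


def poisson_pmf(k, mu):
    """P(N = k) for a Poisson count with mean mu."""
    return math.exp(-mu) * mu**k / math.factorial(k)


rate = 1 / 8        # floods per year
period = 10         # years
mu = rate * period  # mean number of floods in the period
p_inundate = 0.05   # P(structure inundated | a flood occurs)

# (a) Flood-count probabilities.
p_no_flood = poisson_pmf(0, mu)
p_one_flood = poisson_pmf(1, mu)
p_more_than_3 = 1 - sum(poisson_pmf(k, mu) for k in range(4))

# (c) Total probability: sum over n of P(survive | n floods) * P(n floods),
# where P(survive | n floods) = (1 - p_inundate)**n by independence.
survival_series = sum(poisson_pmf(n, mu) * (1 - p_inundate)**n
                      for n in range(60))
survival_closed = math.exp(-mu * p_inundate)
print(round(survival_series, 3), round(survival_closed, 3))
```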


3.40 Traffic on a one-way street that leads to a toll bridge is to be studied. The
volume of the traffic is found to be 120 vehicles per hr on the average and
out of which $ are passenger cars and 1 are trucks. The toll at the bridge is
$0.50 per car and $2 per truck. Assume that the arrivals of vehicles constitute a
Poisson process.
(a) What is the probability that in a period of 1 minute, more than 3
vehicles will arrive at the toll bridge? Ans. 0.1429.
(b) What is the expected total amount of toll collected at the bridge in a
period of 3 hr?
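
For part (a), the Poisson tail probability can be sketched as follows, with the
hourly volume converted to a mean count per minute. The function name
`poisson_tail` is an illustrative assumption.

```python
import math


def poisson_tail(k, mu):
    """P(N > k) for a Poisson count with mean mu."""
    cdf = sum(math.exp(-mu) * mu**i / math.factorial(i) for i in range(k + 1))
    return 1.0 - cdf


vehicles_per_minute = 120 / 60           # mean arrivals in 1 minute
p_more_than_3 = poisson_tail(3, vehicles_per_minute)
print(round(p_more_than_3, 4))
```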

3.41 Strikes among construction workers occur according to the Poisson process;
on the average there is one strike every 3 years. The average duration of a
strike is 15 days, and the corresponding standard deviation is 5 days.
If it costs (in terms of losses) a contractor $10,000 per day of strike, answer
the following.
(a) What would be the expected loss to the contractor during a strike?
(b) If the strike duration is a normal variate, what is the probability that the
contractor may lose in excess of $20,000 during a strike?
(c) In a job that will take 2 years to complete, what would be the contractor's
expected loss from possible strikes? (Remember that the occurrence of
strikes is a Poisson process.) Ans. $100,000.


3.42 The service stations along a highway are located according to a Poisson
process with an average of 1 service station in 10 miles. Because of a gas
shortage, there is a probability of 0.2 that a service station would not have
gasoline available. Assume that the availabilities of gasoline at different
service stations are statistically independent.

(a) What is the probability that there is at most 1 service station in the next
15 miles of highway?

(b) What is the probability that none of the next 3 stations have gasoline
for sale?

(c) A driver on this highway notices that the fuel gauge in his car reads
empty; from experience he knows that he can go another 15 miles.
What is the probability that he will be stranded on the highway without
gasoline?

3.43 Express rapid-transit trains run between two points (for example, between
downtown terminal and airport). Suppose that the passengers arriving at
the terminal and bound for the airport (Fig. P3.43) constitute a Poisson
process with an average rate of 1.5 passengers per minute. If the capacity
of the train is 100 passengers, how often should trains leave the terminal so
that the probability of overcrowding is no more than 10%?
(a) Formulate the problem exactly.
(b) Determine an approximate solution by assuming that the number of
airport-bound passengers is Gaussian with the same mean and standard
deviation as the preceding Poisson distribution.
(c) If the trains depart from the terminal according to the schedule of
part (b), what is the probability that in 5 consecutive departures 1 will
be overcrowded? Assume statistical independence.

[Figure P3.43: passengers arriving at the terminal, bound for the airport.]


3.44 A large radio antenna system consisting of a dish mounted on a truss (see
Fig. P3.44) is designed against wind load. Since damaging wind storms rarely
occur, their occurrences may be modeled by a Poisson process. Local
weather records show that during the past 50 years only 10 damaging wind
storms have been reported. Assume that if a damaging wind storm (or storms)
occur in this period, the probabilities that the dish and the truss will be
damaged in a storm are 0.2 and 0.05, respectively, and that damage to the
dish and truss are statistically independent. Determine the probabilities,
during the next 10 years, for the following events.
(a) There will be more than 2 damaging wind storms.
(b) The antenna system will be damaged, assuming the occurrence of at
most 2 damaging storms.
(c) The antenna system will be damaged.

[Figure P3.44: dish mounted on a truss.]


3.45 The problem in Example 3.17 may be solved by assuming that whenever the
center of a 12-in.-diameter boulder is inside the volume of a cylinder with
15 in. diameter and 50 ft depth, it will be hit by the 3-in. drill hole. On this
basis and the assumption that the occurrence of boulders in the soil mass
constitutes a Poisson process, develop the corresponding solution procedure
for determining the probability of the 3-in. drill hole hitting boulders in a
50-ft depth boring.


3.46 Suppose that the hurricane record for the last 10 years at a certain coastal
city in Texas is as follows.


Year    No. of hurricanes


1961
1962
1963
1964
1965
1966
1967
1968
1969
1970


—"—-Moo-€vwooco-


The occurrence of hurricanes can be described by a Poisson process. The
maximum wind speed of hurricanes usually shows considerable fluctuation.
Suppose that those recorded at this city can be fitted satisfactorily by a
log-normal distribution with mean = 100 ft/sec and standard deviation =
20 ft/sec.

(a) Based on the available data, find the probability that there will be at
least 1 hurricane in this city in the next 2 years. Ans. 0.798.

(b) If a structure in this city is designed for a wind speed of 130 ft/sec, what
is the probability that the structure will be damaged (design wind speed
exceeded) by the next hurricane? Ans. 0.08.

(c) What is the probability that there will be at most 2 hurricanes in the
next 2 years, and that no structure will be damaged during this period?
Ans. 0.718.


3.47 Tornadoes may be divided into two types, namely I (strong) and II (weak).
From 18 years of record in a city, the number of type I and type II tornadoes
are 9 and 54, respectively. The occurrences of each type of tornado are
assumed to be statistically independent and constitute a Poisson process.

(a) What is the probability that there will be exactly 2 tornadoes in the city
next year?

(b) Assuming that exactly 2 tornadoes actually occurred, and 1 of the 2 is
known to be of type I, what is the probability that the other is also
type I?


3.48 Figure P3.48a shows a record of the earthquake occurrences in a county where
a brick masonry tower is to be built to last for 20 years. The tower can
withstand an earthquake whose magnitude is 5 or lower. However, if quakes with
magnitude more than 5 (defined as damaging quake) occur, there is a
likelihood that the tower may fail. The engineer estimated that the probability
of failure of the tower depends on the number of damaging quakes occurring
during its lifetime, which is described in Fig. P3.48b.

(a) What is the probability that the tower will be subjected to less than 3
damaging quakes during its lifetime? Assume earthquake occurrences
may be modeled by a Poisson process.

(b) Determine the probability that the tower will not be destroyed by
earthquakes within its useful life.

(c) Besides earthquakes, the tower may also be subjected to the attack
of tornadoes whose occurrence may be modeled by a Poisson process
with mean recurrence time of 200 years. If a tornado hits the tower, the
tower will be destroyed. Assume that failures caused by earthquakes
and tornadoes are statistically independent. What is the probability
that the tower will fail by these natural hazards within its useful life?

[Figure P3.48a: 50-year record of earthquake occurrences. Figure P3.48b:
probability of failure of the tower versus n, number of damaging quakes.]


3.49 A skyscraper is located in a region where earthquakes and strong winds
may occur. From past record, the mean rate of occurrence of a large
earthquake that may cause damage to the building is 1 in 50 years, whereas
that for strong wind is 1 in 25 years. The occurrences of earthquake and
strong wind may be modeled as independent Poisson processes. Assume
that during a strong earthquake, the probability of damage to the building
is 0.1, whereas the corresponding probability of damage under strong wind
is 0.05. The damages caused by earthquake and wind may be assumed to be
independent events.
(a) What is the probability that the skyscraper will be subjected to strong
winds but not large earthquakes in a 10-year period? Also, determine
the probability of the structure subjected to both large earthquakes
and strong winds in the 10-year period.
(b) What is the probability that the building will be damaged in the 10-year
period?

3.50 The daily water consumption of a city may be assumed to be a Gaussian
random variable with a mean of 500,000 gal/day (gpd), and a standard
deviation of 150,000 gpd. The daily water supply is either 600,000 or 750,000
gallons, with probabilities 0.7 and 0.3, respectively.
(a) What is the probability of water shortage in any given day?
(b) Assuming that the conditions between any consecutive days are
statistically independent, what is the probability of shortage in any given
week?
(c) On the average, how often would water shortage occur? If the
occurrence of water shortage is a Poisson process, what would then be the
probability of shortage in a week?
(d) If the city engineer wants the probability of shortage to be no more
than 1% in any given day, how much water supply is required?


3.51 Steel construction work on multistory buildings is a potentially hazardous
occupation. A building contractor who is building a skyscraper at a steady
pace finds that in spite of a strong emphasis on safety measures, he has been
experiencing accidents among his large group of steel workers; on the average,
about 1 accident occurs every 6 months.

(a) Assuming that the occurrence of a specific accident is not influenced
by any previous accident, find the probability that there will be (exactly)
1 accident in the next 4 months.

(b) What is the probability of at least 1 accident in the next 4 months?

(c) What is the mean number of accidents that the contractor can expect
in a year? What is the standard deviation for the number of accidents
during a period of 1 year?

(d) If the contractor can go through a year without an accident among his
steel construction workers, he will qualify for a safety award. What is
the probability of his receiving this award next year?

(e) If the contractor’s work is to continue at the same pace over the next
5 years, what is the probability that he will win the safety award twice
during this 5-year period?


3.52 Two industrial plants are located along a stream (see Fig. P3.52). The solid
and liquid wastes that are disposed from the plants into the stream are called
effluents. In order to control the quality of the effluent from each plant, there
is an effluent standard established for each plant. Assume that each day, the
effluent of each plant may exceed this effluent standard with probability
p = 0.2, during the actual operation. A good measure of the stream quality
at A as a result of the pollution from these effluent wastes is given by the
dissolved oxygen concentration (DO) at that location. Assume that the DO
has a log-normal distribution with the following medians and COV (in mg/l).

Median    COV

4.2       0.10    when both effluents do not exceed standard
2.1       0.15    when only one effluent exceeds standard
1.6       0.18    when both effluents exceed standard

[Figure P3.52: Plant I and Plant II located along the stream.]

(a) What is the probability that the DO concentration at A will be less than
2 mg/l in any given day?

(b) What is the probability that the DO concentration at A will be less than
2 mg/l in two consecutive days?

(c) It has been proposed as a stream standard that the probability of DO
concentration at A falling below 2 mg/l in a day should not exceed 0.1.
What should be the allowable maximum value of p (the probability of
exceeding the effluent standard for each plant)?


3.53 The daily concentration of a certain pollutant in a stream has the exponential 
distribution shown in Fig. P3.53. 

(a) If the mean daily concentration of the pollutant is 2 mg/10? liter, 
determine the constant c in the exponential distribution. 

(b) Suppose that the problem of pollution will occur if the concentration of 
the pollutant exceeds 6 mg/10? liter. What is the probability of pollution 
problem resulting from this pollutant in a single day? 

(c) What is the return period (in days) associated with this concentration 
level of 6 mg/10? liter? Assume that the concentration of the pollutant 
is statistically independent between days. Ans. 20 days.

(d) What is the probability that this pollutant will cause a pollution problem 
at most once in the next 3 days? Ans. 0.993. 

(e) If, instead of the exponential distribution, the daily pollutant
concentration is Gaussian with the same mean and variance, what would be the
probability of pollution in a day in this case? Ans. 0.022.
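
The stated answers to parts (c) through (e) can be checked with the sketch
below. It assumes the density has the one-parameter form c·exp(−c·x), so that
the mean is 1/c and the standard deviation equals the mean; concentrations are
in the problem's own units. Variable names are illustrative.

```python
import math
from statistics import NormalDist

mean = 2.0       # mean daily concentration
c = 1.0 / mean   # (a) exponential parameter, since the mean is 1/c
limit = 6.0      # concentration above which pollution occurs

# (b) Exponential tail: P(X > limit) = exp(-c * limit).
p_exceed = math.exp(-c * limit)
# (c) Return period in days, with independent days.
return_period = 1.0 / p_exceed
# (d) Binomial: pollution on at most 1 of the next 3 days.
p_at_most_once = (1 - p_exceed)**3 + 3 * p_exceed * (1 - p_exceed)**2
# (e) Gaussian with the same mean and variance (sd of exponential = mean).
p_exceed_gauss = 1.0 - NormalDist(mean, mean).cdf(limit)

print(round(return_period, 1), round(p_at_most_once, 3),
      round(p_exceed_gauss, 3))
```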


[Figure P3.53: exponential density $f_X(x) = c\,e^{-cx}$, c = constant,
plotted against x, concentration.]


3.54 The interarrival times of vehicles on a road follow an exponential
distribution with a mean of 15 sec. A gap of 20 sec is required for a car from
a side street to cross the road or to join the traffic.

(a) What is the proportion of gaps that are less than 20 sec?
(b) What is the average (mean) interarrival time for all the gaps that are
longer than 20 sec?
(c) In 1 hr, what is the expected total time occupied by gaps that are less
than 20 sec? (Hint. What is the expected number of gaps that are
less than 20 sec in 1 hr?)


3.55 The occurrences of tornadoes in a midwestern county may be modeled by a
Poisson process with a mean occurrence rate of 2.5 tornadoes per year.
(a) What is the probability that the recurrence time between tornadoes will
be longer than 8 months?
(b) Derive the distribution of the time till the occurrence of the second
tornado. On the basis of this distribution, determine the probability
that a second tornado will occur within a given year.


3.56 The time of operation of a piece of construction equipment until breakdown
follows an exponential distribution with a mean of 24 months. The present
inspection program is scheduled at every 5 months.

(a) What is the probability that a piece of equipment will need repair at the
first scheduled inspection date?

(b) If a piece of equipment has not broken down by the first scheduled
inspection date, what is the probability that it will be operational beyond
the next scheduled inspection date?

(c) The company owns 5 pieces of a certain type of equipment; assuming
that the service lives of equipment are statistically independent,
determine the probability that at most 1 piece of equipment will need
repair at the scheduled inspection date.

(d) If it is desired to limit the probability of repair at each scheduled
inspection date to not more than 10%, what should be the inspection interval?
The conditions of part (c) remain valid.


3.57 The cost for the facilities to release and refill water for a navigation lock
in a canal increases with decreasing time required for each cycle of operation.
For purposes of design, it has been observed that the time of arrival of boats
follows an exponential distribution with a mean interarrival time of 0.5 hr.
Assume that the navigation lock is to be designed so that 80% of the incoming
traffic can pass through the lock without waiting.

(a) What should be the design time of each cycle of operation? Ans. 0.1 hr.

(b) What is the probability that of 4 successive arrivals, none of them have
to wait at the lock? Ans. 0.41.

(c) Suppose that one boat leaves town A every 8 hr, and has to go through
the lock to reach its destination. What is the probability that at least 1
of the boats leaving town A in a 24-hr day has to wait at the lock?
Ans. 0.488.
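
One reading that reproduces the stated answers is sketched below. It assumes a
boat passes without waiting when the gap since the preceding arrival exceeds
the cycle time, that successive boats are independent, and that 3 boats leave
town A in a 24-hr day. Variable names are illustrative.

```python
import math

mean_interarrival = 0.5          # hr
rate = 1.0 / mean_interarrival   # arrivals per hr
p_no_wait = 0.80                 # design fraction passing without waiting

# (a) Solve P(T > t) = exp(-rate * t) = p_no_wait for the cycle time t.
cycle_time = -math.log(p_no_wait) / rate
# (b) None of 4 successive arrivals has to wait.
p_none_of_4_wait = p_no_wait**4
# (c) At least 1 of the 3 daily boats from town A has to wait.
p_some_of_3_wait = 1 - p_no_wait**3

print(round(cycle_time, 2), round(p_none_of_4_wait, 2),
      round(p_some_of_3_wait, 3))
```

The computed cycle time is slightly above 0.1 hr, which suggests the stated
answer is rounded.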


3.58 A pipe carrying water is supported on short concrete piers that are spaced
20 ft apart as shown in Fig. P3.58a. The pipe is saddled on the piers as shown
in Fig. P3.58b. When subjected to lateral earthquake motions, there is a
horizontal inertia force that will tend to dislodge the pipe from its supports.
The maximum lateral inertia force F at each pier may be estimated as

$F = \dfrac{w}{g}\,a$

where
w = the weight of the pipe and water for a 20-ft section;
g = acceleration of gravity = 32.2 ft/sec²;
a = maximum horizontal earthquake acceleration.

[Figure P3.58a: pipe supported on piers spaced 20 ft apart. Figure P3.58b:
pipe saddled on a pier.]

The pipe has a diameter of 4 ft, so that the total weight per foot of pipe
and contents is 800 lb per ft. Assume that the maximum acceleration during a
strong-motion earthquake is a log-normal variate with a mean of 0.4g
and a COV of 25%.

(a) What is the probability that during such an earthquake, the pipe will
be dislodged from a pier support (by rolling out of the saddle)?

(b) If there are 5 piers supporting the pipe over a ravine, what is the
probability that the pipe will not be dislodged anywhere? Assume the
conditions between supports to be statistically independent.

(c) If the occurrence of strong-motion earthquakes is a Poisson process,
and such earthquakes are expected (on the average) once every 3 years,
what is the probability that the pipe may be dislodged from its supports
over a period of 10 years?

3.59 Ten percent of the 200 tendons required to prestress a nuclear reactor structure
have been corroded during the last year. Suppose that 10 tendons were
selected at random and inspected for corrosion; what is the probability
that none of the tendons inspected show signs of corrosion? What is the
probability that there will be at least one corroded tendon among those
inspected?

3.60 The fill in an earth embankment is compacted to a specified CBR (California
Bearing Ratio). The entire embankment can be divided into 100 sections,
of which 10 do not meet the required CBR.

(a) Suppose that 5 sections are selected at random and tested for their CBR,
and acceptance here requires all 5 sections to meet the CBR limit.
What is the probability that the compaction of the embankment will be
accepted?

(b) If, instead of 5, 10 sections will be inspected and acceptance requires all
10 sections to meet the CBR limit, what is the probability of acceptance
now?


Section 3.3 


3.61 Both east and west bound rush-hour traffic on a toll bridge are counted at
10-sec intervals. The following table shows the number of observations for
each combination of east and west bound traffic counts:

[Table of observation counts. Columns: number of westbound vehicles. Rows: number of eastbound vehicles.]

Total number of observations = 665


Let X = number of eastbound vehicles in a 10-sec interval. 
Y = number of westbound vehicles in a 10-sec interval. 


(a) Compute and plot the joint probability mass function of X and Y. 

(b) Determine the marginal PMF of X. 

(c) If there are 3 eastbound vehicles on the bridge in a 10-sec interval,
determine the PMF of westbound vehicles in the same interval.

(d) In a 10-sec interval, what is the probability that 4 vehicles are going
east if there are also 4 vehicles going west at the same time?

(e) Determine the covariance Cov (X, Y), and evaluate the corresponding
correlation coefficient between X and Y.


3.62 The joint density function of the material and labor cost of a construction
project is modeled as follows:

$$f_{X,Y}(x, y) = 2y\,e^{-y(2+x)} \qquad x, y \ge 0$$
$$\phantom{f_{X,Y}(x, y)} = 0 \qquad \text{elsewhere}$$

where X = material cost in $100,000
Y = labor cost in $100,000


(a) What is the probability that the material and labor costs of the next
construction project will be less than $100,000 and $200,000, respectively?

(b) Determine the marginal density function of material cost in a project.

(c) Determine the marginal density function of labor cost in a project.

(d) Are the material and labor costs in the construction project statistically
independent? Why? 

(e) If it is known that the cost of material in the project is $200,000, what 
is the probability that its labor cost will exceed $200,000? 


4. Functions of Random 
Variables 


4.1. INTRODUCTION 


Engineering problems often involve the evaluation of functional relations 
between a dependent variable and one or more basic (independent) vari- 
ables. If any of the basic variables are random, the dependent variable 
will likewise be random; its probability distribution, as well as its moments, 
will be functionally related to and may be derived from those of the basic 
random variables. 



4.2. DERIVED PROBABILITY DISTRIBUTIONS
4.2.1. Function of single random variable

Consider first the function of a single random variable,

$$Y = g(X) \tag{4.1}$$


This means that when $Y = y$, $X = x = g^{-1}(y)$, where $g^{-1}$ is the inverse
function of $g$. [Assume for the moment that $g(x)$ is a monotonically increasing
function of $x$ with a unique inverse $g^{-1}(y)$.] Thus

$$P(Y = y) = P(X = x) = P[X = g^{-1}(y)]$$

That is, the PMF of $Y$ is

$$p_Y(y) = p_X[g^{-1}(y)] \tag{4.2}$$


Also, it follows that

$$P(Y \le y) = P[X \le g^{-1}(y)]$$

Thus

$$F_Y(y) = F_X[g^{-1}(y)] \tag{4.3}$$

Hence, for discrete $X$,

$$F_Y(y) = \sum_{\text{all } x_i \le g^{-1}(y)} p_X(x_i) \tag{4.4}$$

whereas, for continuous $X$, Eq. 4.3 yields

$$F_Y(y) = \int_{\{x \le g^{-1}(y)\}} f_X(x)\,dx = \int_{-\infty}^{g^{-1}(y)} f_X(x)\,dx \tag{4.5}$$
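In the latter case (that is, $X$ continuous), we recall from calculus that by
making a change of the variable of integration, Eq. 4.5 becomes

$$F_Y(y) = \int_{-\infty}^{g^{-1}(y)} f_X(x)\,dx = \int_{-\infty}^{y} f_X(g^{-1})\,\frac{dg^{-1}}{dy}\,dy$$

where $g^{-1} = g^{-1}(y)$. Therefore the density function of $Y$ is

$$f_Y(y) = \frac{dF_Y(y)}{dy} = f_X(g^{-1})\,\frac{dg^{-1}}{dy}$$

This assumes that $y$ increases with $x$. When $y$ decreases with increasing $x$,
$F_Y(y) = 1 - F_X(g^{-1})$; then

$$f_Y(y) = -f_X(g^{-1})\,\frac{dg^{-1}}{dy}$$

However, in this latter case $dg^{-1}/dy$ is negative. Properly then the derived
density function is

$$f_Y(y) = f_X(g^{-1})\left|\frac{dg^{-1}}{dy}\right| \tag{4.6}$$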


EXAMPLE 4.1 


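Suppose that $X$ is a normal variate with parameters $\mu$ and $\sigma$. Determine the
density function of $Y = (X - \mu)/\sigma$.

The inverse function is $x = \sigma y + \mu$, and $dx/dy = \sigma$. Thus Eq. 4.6 yields

$$f_Y(y) = \frac{1}{\sqrt{2\pi}\,\sigma}\exp\left[-\frac{1}{2}\left(\frac{\sigma y + \mu - \mu}{\sigma}\right)^2\right]\sigma = \frac{1}{\sqrt{2\pi}}\,e^{-y^2/2}$$

Therefore $Y$ is a standard normal variate with density function N(0, 1).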


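EXAMPLE 4.2


If $X$ has a log-normal distribution with parameters $\lambda$ and $\zeta$, what is the distribution
of $Y = \ln X$? In this case

$$f_X(x) = \frac{1}{\sqrt{2\pi}\,\zeta x}\exp\left[-\frac{1}{2}\left(\frac{\ln x - \lambda}{\zeta}\right)^2\right]$$

and

$$x = e^{y}, \qquad \frac{dx}{dy} = e^{y}$$

Therefore, according to Eq. 4.6,

$$f_Y(y) = \frac{1}{\sqrt{2\pi}\,\zeta e^{y}}\exp\left[-\frac{1}{2}\left(\frac{y - \lambda}{\zeta}\right)^2\right]e^{y} = \frac{1}{\sqrt{2\pi}\,\zeta}\exp\left[-\frac{1}{2}\left(\frac{y - \lambda}{\zeta}\right)^2\right]$$

Hence the distribution of $Y$ is normal with mean value $\lambda$ and standard deviation $\zeta$;
that is, $E(\ln X) = \lambda$, $\mathrm{Var}(\ln X) = \zeta^2$.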
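
The result of Example 4.2 can be checked numerically. The sketch below applies Eq. 4.6 to $Y = \ln X$ and compares the derived density with a normal density having mean $\lambda$ and standard deviation $\zeta$. The parameter values and the helper names (`lognormal_pdf`, `derived_pdf`) are illustrative assumptions, not part of the text.

```python
import math
from statistics import NormalDist


def lognormal_pdf(x, lam, zeta):
    """Log-normal density with parameters lam (lambda) and zeta."""
    if x <= 0.0:
        return 0.0
    u = (math.log(x) - lam) / zeta
    return math.exp(-0.5 * u * u) / (math.sqrt(2.0 * math.pi) * zeta * x)


def derived_pdf(f_x, g_inv, dg_inv_dy, y):
    """Eq. 4.6: f_Y(y) = f_X(g^-1(y)) * |d g^-1 / dy| for a monotonic g."""
    return f_x(g_inv(y)) * abs(dg_inv_dy(y))


lam, zeta = 0.5, 0.3  # assumed illustrative parameter values


def f_y(y):
    # Y = ln X, so x = g^-1(y) = e^y and dx/dy = e^y
    return derived_pdf(lambda x: lognormal_pdf(x, lam, zeta), math.exp, math.exp, y)


normal = NormalDist(mu=lam, sigma=zeta)
max_diff = max(abs(f_y(y) - normal.pdf(y)) for y in [-1.0, 0.0, 0.5, 1.0, 2.0])
print("largest difference from the normal density:", max_diff)
```

The difference should be at the level of floating-point round-off, since the two expressions are algebraically identical.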


We observe that the inverse function $g^{-1}(y)$ may not be single-valued;
that is, there may be multiple values of $x$ for a given value of $y$. In such
cases, if $g^{-1}(y) = x_1, x_2, \ldots, x_k$, we have

$$(Y = y) = \bigcup_{i=1}^{k} (X = x_i)$$

Hence, for discrete $X$,

$$p_Y(y) = \sum_{i=1}^{k} p_X(x_i) \tag{4.7}$$

whereas, if $X$ is continuous,

$$f_Y(y) = \sum_{i=1}^{k} f_X(g_i^{-1})\left|\frac{dg_i^{-1}}{dy}\right| \tag{4.8}$$

in which $g_i^{-1} = x_i$ is the $i$th root of $g^{-1}(y)$.

EXAMPLE 4.3


The strain energy in a linearly elastic bar subjected to a force $S$ is given by

$$U = \frac{L}{2AE}\,S^2$$

where
L = length of the bar
A = cross-sectional area of the bar
E = modulus of elasticity of the elastic material

Then, if $S$ is a standard normal variate N(0, 1), the density function of $U$ is obtained
on the basis of Eq. 4.8 as follows.
Rewriting

$$U = cS^2$$

where $c = L/2AE$, we have

$$s = \pm\sqrt{\frac{u}{c}}$$

and thus

$$\frac{ds}{du} = \pm\frac{1}{2\sqrt{cu}}$$

or

$$\left|\frac{ds}{du}\right| = \frac{1}{2\sqrt{cu}}$$

Hence the density function of the strain energy $U$, in accordance with Eq. 4.8, is

$$f_U(u) = \left[f_S\left(\sqrt{\frac{u}{c}}\right) + f_S\left(-\sqrt{\frac{u}{c}}\right)\right]\frac{1}{2\sqrt{cu}} = \frac{1}{\sqrt{2\pi c u}}\,e^{-u/2c} \qquad u \ge 0$$

which is a chi-square-type distribution with one degree of freedom (see Eq. 5.40 of
Chapter 5). Graphically, this distribution would appear as shown in Fig. E4.3.

Figure E4.3
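
A short numerical sketch of Example 4.3 follows. It evaluates Eq. 4.8 by summing over the two roots $s = \pm\sqrt{u/c}$, compares the result with the closed form, and checks that the density integrates to $P(|S| \le \sqrt{u_0/c})$. The value of `c`, the test points, and the function names are assumptions made only for illustration.

```python
import math


def std_normal_pdf(s):
    """Density of the standard normal variate N(0, 1)."""
    return math.exp(-0.5 * s * s) / math.sqrt(2.0 * math.pi)


def strain_energy_pdf(u, c):
    """Eq. 4.8 applied to U = c*S**2: sum over the roots s = +sqrt(u/c), -sqrt(u/c)."""
    if u <= 0.0:
        return 0.0
    root = math.sqrt(u / c)
    jacobian = 1.0 / (2.0 * math.sqrt(c * u))  # |ds/du|
    return (std_normal_pdf(root) + std_normal_pdf(-root)) * jacobian


def closed_form_pdf(u, c):
    """Closed-form density obtained in Example 4.3."""
    return math.exp(-u / (2.0 * c)) / math.sqrt(2.0 * math.pi * c * u)


def cdf_numeric(u0, c, n=2000):
    """P(U <= u0) by the midpoint rule; u = t**2 removes the singularity at u = 0."""
    h = math.sqrt(u0) / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * h
        total += strain_energy_pdf(t * t, c) * 2.0 * t * h
    return total


c = 2.0  # assumed value of L/(2AE)
max_diff = max(abs(strain_energy_pdf(u, c) - closed_form_pdf(u, c))
               for u in [0.1, 0.5, 1.0, 4.0, 10.0])

u0 = 3.0
# P(U <= u0) = P(|S| <= sqrt(u0/c)) = erf(sqrt(u0/(2c))) for standard normal S
cdf_exact = math.erf(math.sqrt(u0 / (2.0 * c)))
cdf_check = cdf_numeric(u0, c)
print(max_diff, cdf_check, cdf_exact)
```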


EXAMPLE 4.4 


The height of earth dams must allow sufficient freeboard above the maximum
reservoir level to prevent waves from washing over the top. The determination of
this height would include the consideration of wind tide and wave height.

The wind tide, in feet, above still-water level is

$$Z = \frac{V^2 F}{1400\,d}$$

where
V = wind speed in miles per hour
F = fetch, or length of water surface over which the wind blows, in feet
d = average depth of lake along the fetch, in feet

If the wind speed has an exponential distribution with mean speed $v_0$; that is,

$$f_V(v) = \frac{1}{v_0}\,e^{-v/v_0} \qquad v \ge 0$$
$$\phantom{f_V(v)} = 0 \qquad v < 0$$

then we determine the distribution of the tide $Z$ as follows.

Denoting $a = F/1400\,d$, we have $Z = aV^2$; thus

$$v = \pm\sqrt{\frac{z}{a}}$$

and

$$\left|\frac{dv}{dz}\right| = \frac{1}{2\sqrt{az}}$$

Then, according to Eq. 4.8,

$$f_Z(z) = \left[f_V\left(\sqrt{\frac{z}{a}}\right) + f_V\left(-\sqrt{\frac{z}{a}}\right)\right]\frac{1}{2\sqrt{az}}$$

However, in this case since $f_V(v) = 0$ for $v < 0$, we have

$$f_Z(z) = f_V\left(\sqrt{\frac{z}{a}}\right)\frac{1}{2\sqrt{az}} = \frac{1}{2v_0\sqrt{az}}\exp\left(-\frac{1}{v_0}\sqrt{\frac{z}{a}}\right) \qquad z \ge 0$$
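
As a rough check of Example 4.4, the sketch below simulates exponentially distributed wind speeds, forms $Z = aV^2$, and compares the empirical fraction of tides not exceeding a level $z_0$ with the CDF implied by the derived density, $F_Z(z) = 1 - \exp(-\sqrt{z/a}/v_0)$. The numerical values of `a`, `v0`, and `z0`, the seed, and the sample size are assumptions chosen only for illustration.

```python
import math
import random


def tide_pdf(z, a, v0):
    """Derived density of the wind tide Z = a*V**2 (Example 4.4)."""
    if z <= 0.0:
        return 0.0
    return math.exp(-math.sqrt(z / a) / v0) / (2.0 * v0 * math.sqrt(a * z))


def tide_cdf(z, a, v0):
    """P(Z <= z) = P(V <= sqrt(z/a)) for exponential V with mean v0."""
    if z <= 0.0:
        return 0.0
    return 1.0 - math.exp(-math.sqrt(z / a) / v0)


a, v0 = 0.002, 20.0        # assumed: a = F/(1400 d), mean wind speed in mph
rng = random.Random(12345)  # fixed seed so the run is repeatable
n = 100_000
samples = [a * rng.expovariate(1.0 / v0) ** 2 for _ in range(n)]

z0 = 1.0  # assumed tide level in feet
empirical = sum(z <= z0 for z in samples) / n
analytical = tide_cdf(z0, a, v0)

# The density should be the derivative of the CDF (central finite difference).
h = 1e-5
slope = (tide_cdf(z0 + h, a, v0) - tide_cdf(z0 - h, a, v0)) / (2.0 * h)
print(empirical, analytical, slope, tide_pdf(z0, a, v0))
```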
4.2.2. Function of multiple random variables


Next, consider the function of two random variables $X$ and $Y$,

$$Z = g(X, Y) \tag{4.9}$$

In this case, $(Z = z)$ refers to the same event as $[g(X, Y) = z]$; that is,

$$(Z = z) = [g(X, Y) = z] = \bigcup_{\{g(x_i, y_j) = z\}} (X = x_i,\ Y = y_j)$$

Hence, the PMF of $Z$ is

$$p_Z(z) = \sum_{g(x_i, y_j) = z} p_{X,Y}(x_i, y_j) \tag{4.10}$$

and the corresponding CDF is

$$F_Z(z) = \sum_{g(x_i, y_j) \le z} p_{X,Y}(x_i, y_j) \tag{4.11}$$

In particular, if $Z = X + Y$,

$$p_Z(z) = \sum_{x_i + y_j = z} p_{X,Y}(x_i, y_j) = \sum_{\text{all } x_i} p_{X,Y}(x_i, z - x_i) \tag{4.12}$$


Sum of random variables with Poisson distributions. Suppose that
$X$ and $Y$ are statistically independent and have Poisson distributions with
parameters $\nu$ and $\mu$, respectively; that is,

$$p_X(x) = \frac{(\nu t)^x}{x!}\,e^{-\nu t} \qquad p_Y(y) = \frac{(\mu t)^y}{y!}\,e^{-\mu t}$$

Then according to Eq. 4.12, the PMF of $Z = X + Y$ is

$$p_Z(z) = \sum_{\text{all } x} p_{X,Y}(x, z - x) = \sum_{\text{all } x} \frac{(\nu t)^x (\mu t)^{z-x}}{x!\,(z - x)!}\,e^{-(\nu + \mu)t} = e^{-(\nu + \mu)t}\,t^z \sum_{\text{all } x} \frac{\nu^x \mu^{z-x}}{x!\,(z - x)!}$$

But the sum is the binomial expansion of $(\nu + \mu)^z/z!$; thus

$$p_Z(z) = \frac{[(\nu + \mu)t]^z}{z!}\,e^{-(\nu + \mu)t}$$

which means that $Z$ also has a Poisson distribution with parameter $(\nu + \mu)$.

Generalizing this result, we infer that the sum of two or more independent
Poisson processes is also a Poisson process; that is, if

$$Z = \sum_{i=1}^{n} X_i$$

where $X_i$ has a Poisson PMF with parameter $\nu_i$, the PMF of $Z$ is also a
Poisson distribution with parameter

$$\nu_Z = \sum_{i=1}^{n} \nu_i \tag{4.13}$$

However, the difference of two Poisson processes is not a Poisson process;
that is, it can be shown that the PMF of $Z = X - Y$ does not yield a Poisson
distribution.
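
The Poisson result can be verified directly from Eq. 4.12. The sketch below convolves two Poisson PMFs term by term and compares the outcome with a Poisson PMF whose parameter is the sum of the two. The rates `nu`, `mu` and the time `t` are arbitrary illustrative values, and the function names are assumptions.

```python
import math


def poisson_pmf(k, mean):
    """Poisson PMF with the given mean (that is, nu*t)."""
    return mean ** k * math.exp(-mean) / math.factorial(k)


def sum_pmf(z, mean_x, mean_y):
    """Eq. 4.12 for independent X and Y: sum over x of p_X(x) * p_Y(z - x)."""
    return sum(poisson_pmf(x, mean_x) * poisson_pmf(z - x, mean_y)
               for x in range(z + 1))


nu, mu, t = 2.0, 3.0, 1.5  # assumed mean rates and time interval
max_diff = max(abs(sum_pmf(z, nu * t, mu * t) - poisson_pmf(z, (nu + mu) * t))
               for z in range(30))
print("largest difference between the convolution and Poisson(nu + mu):", max_diff)
```

Note that no such check is possible for $Z = X - Y$: the difference can take negative values, so it cannot have a Poisson PMF.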


EXAMPLE 4.5 


Suppose that a toll bridge serves three suburban residential districts A, B, C (see
Fig. E4.5). It is estimated that during peak hours of the day, the average volumes of
traffic from each of these three districts are, respectively, 2, 3, and 4 vehicles
per minute. If the peak vehicular traffic from each of the respective districts is a Poisson
process, the traffic crossing the toll bridge would also be a Poisson process with an
average crossing volume of 9 vehicles per minute.

Figure E4.5


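If $X$ and $Y$ are continuous, Eq. 4.11 becomes

$$F_Z(z) = \iint_{\{g(x, y) \le z\}} f_{X,Y}(x, y)\,dx\,dy = \int_{-\infty}^{\infty}\int_{-\infty}^{g^{-1}} f_{X,Y}(x, y)\,dx\,dy \tag{4.14}$$

where $g^{-1} = g^{-1}(z, y)$. Changing the variable of integration from $x$ to $z$,

$$F_Z(z) = \int_{-\infty}^{\infty}\int_{-\infty}^{z} f_{X,Y}(g^{-1}, y)\left|\frac{\partial g^{-1}}{\partial z}\right| dz\,dy$$

Thus the PDF of $Z$ is

$$f_Z(z) = \int_{-\infty}^{\infty} f_{X,Y}(g^{-1}, y)\left|\frac{\partial g^{-1}}{\partial z}\right| dy \tag{4.15}$$

Alternatively, taking $g^{-1} = g^{-1}(x, z)$, we also have

$$f_Z(z) = \int_{-\infty}^{\infty} f_{X,Y}(x, g^{-1})\left|\frac{\partial g^{-1}}{\partial z}\right| dx \tag{4.15a}$$

Specifically, if

$$Z = aX + bY$$

we have

$$x = g^{-1} = \frac{z - by}{a}, \qquad \frac{\partial g^{-1}}{\partial z} = \frac{1}{a}$$

Then Eq. 4.15 would be

$$f_Z(z) = \int_{-\infty}^{\infty} \frac{1}{|a|}\,f_{X,Y}\left(\frac{z - by}{a},\, y\right) dy \tag{4.16}$$

and if $X$ and $Y$ are statistically independent,

$$f_Z(z) = \frac{1}{|a|}\int_{-\infty}^{\infty} f_X\left(\frac{z - by}{a}\right) f_Y(y)\,dy \tag{4.16a}$$

or, based on Eq. 4.15a,

$$f_Z(z) = \frac{1}{|b|}\int_{-\infty}^{\infty} f_X(x)\,f_Y\left(\frac{z - ax}{b}\right) dx \tag{4.16b}$$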


EXAMPLE 4.6 


Shown in Fig. E4.6 is an idealized model of a one-story building, with the total
mass $m$ concentrated at the roof level. When subjected to earthquake ground shaking,
the building will vibrate about its original (at rest) position, inducing velocity
components $X$ and $Y$ of the mass, with a resultant velocity $Z = \sqrt{X^2 + Y^2}$.

If $X$ and $Y$ are, respectively, standard normal variates, that is, with distribution
N(0, 1), determine the probability distribution of the resultant kinetic energy of the
mass during an earthquake.

Figure E4.6 (a) Elevation. (b) Plan

The resultant kinetic energy is

$$W = mZ^2 = m(X^2 + Y^2)$$

Let $U = mX^2$ and $V = mY^2$; then

$$W = U + V$$

From Example 4.3, we see that the distributions of $U$ and $V$ are, respectively, chi-square
with one degree of freedom; that is,

$$f_U(u) = \frac{1}{\sqrt{2\pi m u}}\,e^{-u/2m} \qquad u \ge 0$$

$$f_V(v) = \frac{1}{\sqrt{2\pi m v}}\,e^{-v/2m} \qquad v \ge 0$$

Then, according to Eq. 4.16b and observing that $v = w - u \ge 0$, we obtain the
density function of the kinetic energy $W$ as follows:

$$f_W(w) = \int_0^w \frac{1}{\sqrt{2\pi m u}}\,e^{-u/2m}\,\frac{1}{\sqrt{2\pi m (w - u)}}\,e^{-(w - u)/2m}\,du = \frac{1}{2\pi m}\,e^{-w/2m}\int_0^w u^{-1/2}(w - u)^{-1/2}\,du$$

Now let $r = u/w$; then $du = w\,dr$, and

$$f_W(w) = \frac{1}{2\pi m}\,e^{-w/2m}\int_0^1 r^{-1/2}(1 - r)^{-1/2}\,dr$$

It can be observed that the above integral is the beta function $B(\tfrac{1}{2}, \tfrac{1}{2})$ of Eq. 3.49;
furthermore, using Eq. 3.49a and observing that $\Gamma(\tfrac{1}{2}) = \sqrt{\pi}$ and $\Gamma(1) = 1.0$, we
have

$$B(\tfrac{1}{2}, \tfrac{1}{2}) = \frac{\Gamma(\tfrac{1}{2})\,\Gamma(\tfrac{1}{2})}{\Gamma(1)} = \pi$$

Hence

$$f_W(w) = \frac{1}{2m}\,e^{-w/2m} \qquad w \ge 0$$

which is a chi-square-type distribution (see Eq. 5.40 of Chapter 5) with two degrees
of freedom.
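
The convolution in Example 4.6 can also be evaluated numerically as a sanity check. The sketch below integrates $f_U(u)\,f_V(w - u)$ over $0 < u < w$ by the midpoint rule, after the substitution $u = w\sin^2\theta$ (an assumption of this sketch, used only to remove the endpoint singularities), and compares the result with $(1/2m)\,e^{-w/2m}$. The values of `m` and `w` are illustrative.

```python
import math


def chi_type_pdf(u, m):
    """Density of U = m*X**2 for standard normal X (Example 4.3 with c = m)."""
    return math.exp(-u / (2.0 * m)) / math.sqrt(2.0 * math.pi * m * u)


def kinetic_energy_pdf(w, m, n=2000):
    """Eq. 4.16b with a = b = 1: integrate f_U(u) * f_V(w - u) over 0 < u < w.

    The substitution u = w*sin(theta)**2 maps the range to 0 < theta < pi/2 and
    cancels the u**(-1/2) and (w - u)**(-1/2) singularities at the endpoints.
    """
    h = (math.pi / 2.0) / n
    total = 0.0
    for k in range(n):
        theta = (k + 0.5) * h  # midpoints never touch the endpoints
        u = w * math.sin(theta) ** 2
        du_dtheta = 2.0 * w * math.sin(theta) * math.cos(theta)
        total += chi_type_pdf(u, m) * chi_type_pdf(w - u, m) * du_dtheta * h
    return total


m, w = 2.0, 3.0  # assumed mass and energy level
numeric = kinetic_energy_pdf(w, m)
closed_form = math.exp(-w / (2.0 * m)) / (2.0 * m)
print(numeric, closed_form)
```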


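Sum (and difference) of independent normal variates. If $X$ and $Y$
are statistically independent normal variates with means and standard
deviations $\mu_X$, $\sigma_X$ and $\mu_Y$, $\sigma_Y$, respectively, the distribution of $Z = X + Y$,
according to Eq. 4.16a, is

$$f_Z(z) = \frac{1}{2\pi\sigma_X\sigma_Y}\int_{-\infty}^{\infty}\exp\left[-\frac{1}{2}\left(\frac{z - y - \mu_X}{\sigma_X}\right)^2 - \frac{1}{2}\left(\frac{y - \mu_Y}{\sigma_Y}\right)^2\right]dy$$

$$\phantom{f_Z(z)} = \frac{1}{2\pi\sigma_X\sigma_Y}\exp\left\{-\frac{1}{2}\left[\left(\frac{z - \mu_X}{\sigma_X}\right)^2 + \left(\frac{\mu_Y}{\sigma_Y}\right)^2\right]\right\}\int_{-\infty}^{\infty}\exp\left[-\frac{1}{2}(uy^2 - 2vy)\right]dy$$

where

$$u = \frac{1}{\sigma_X^2} + \frac{1}{\sigma_Y^2}$$

and

$$v = \frac{\mu_Y}{\sigma_Y^2} + \frac{z - \mu_X}{\sigma_X^2}$$

Completing the square for the last integrand above, and then substituting

$$w = y - \frac{v}{u}$$

the last integral above becomes

$$\int_{-\infty}^{\infty}\exp\left[-\frac{1}{2}(uy^2 - 2vy)\right]dy = e^{v^2/2u}\int_{-\infty}^{\infty}\exp\left(-\frac{1}{2}uw^2\right)dw = \sqrt{\frac{2\pi}{u}}\,e^{v^2/2u}$$

After some algebraic reduction, the final result for the density function of
$Z$ becomes

$$f_Z(z) = \frac{1}{\sqrt{2\pi(\sigma_X^2 + \sigma_Y^2)}}\exp\left[-\frac{1}{2}\left(\frac{z - (\mu_X + \mu_Y)}{\sqrt{\sigma_X^2 + \sigma_Y^2}}\right)^2\right]$$

which we recognize is also a normal density function with mean

$$\mu_Z = \mu_X + \mu_Y$$

and variance

$$\sigma_Z^2 = \sigma_X^2 + \sigma_Y^2$$

By the same procedure, it can be shown that $Z = X - Y$ is also Gaussian
with mean $\mu_Z = \mu_X - \mu_Y$ and the same variance as above: $\sigma_Z^2 = \sigma_X^2 + \sigma_Y^2$.

On the basis of these results, it can be shown inductively that if

$$Z = \sum_{i=1}^{n} a_i X_i$$

where $a_i$ are constants, and $X_i$ are statistically independent normal variates
$N(\mu_{X_i}, \sigma_{X_i})$, then $Z$ is also Gaussian with mean

$$\mu_Z = \sum_{i=1}^{n} a_i \mu_{X_i} \tag{4.17}$$

and variance

$$\sigma_Z^2 = \sum_{i=1}^{n} a_i^2 \sigma_{X_i}^2 \tag{4.18}$$

In other words, any linear function of independent normal variates is also a normal
variate. The relationships of Eqs. 4.17 and 4.18, however, are not limited
to normal variates. We shall observe later in Section 4.3.2 that these
equations are, in fact, valid for linear functions of any statistically independent
random variables regardless of their distributions.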
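
Eqs. 4.17 and 4.18 translate directly into a small helper. The sketch below is one possible arrangement (the function name `linear_normal` is an assumption); it is exercised with the two branch travel times used for truck A in Example 4.7, namely means of 3 hr and 5 hr with a 20% coefficient of variation.

```python
import math
from statistics import NormalDist


def linear_normal(coeffs, means, sigmas):
    """Distribution of Z = sum(a_i * X_i) for independent normal X_i.

    Mean from Eq. 4.17 and variance from Eq. 4.18.
    """
    mean_z = sum(a * m for a, m in zip(coeffs, means))
    var_z = sum((a * s) ** 2 for a, s in zip(coeffs, sigmas))
    return NormalDist(mean_z, math.sqrt(var_z))


# Branch travel times (hr) for truck A of Example 4.7, each with 20% COV
means = [3.0, 5.0]
sigmas = [0.2 * m for m in means]

total = linear_normal([1.0, 1.0], means, sigmas)
p_within_9 = total.cdf(9.0)
print(total.mean, total.variance, p_within_9)
```

The probability printed should agree with the value of about 0.805 quoted in Example 4.7(a), up to the rounding of the normal table.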


EXAMPLE 4.7

A trucking network links four cities, namely, (1) Cleveland, (2) New York,
(3) Philadelphia, and (4) Pittsburgh. The expected travel time for each branch is
indicated in Fig. E4.7, in hours. Assume that the travel times for each of the
branches are independently Gaussian, with 20% coefficient of variation. Two trucks
are dispatched at the same time from Pittsburgh to New York City, with truck A
going via Cleveland and truck B via Philadelphia.

(a) What is the probability that truck A will arrive at the destination within 9 hr?
(b) What is the probability that truck A will arrive at the destination earlier than
truck B?


Solution
(a) Let $T_A$ be the total travel time for truck A. Hence

$$T_A = T_{41} + T_{12}$$

which is a sum of two independent normal random variables. The mean and
variance of $T_A$ are, respectively,

$$\mu_{T_A} = \mu_{41} + \mu_{12} = 3 + 5 = 8 \text{ hr}$$

and

$$\sigma_{T_A}^2 = (0.2 \times 3)^2 + (0.2 \times 5)^2 = 1.36 \text{ hr}^2$$

Therefore

$$P(T_A < 9) = \Phi\left(\frac{9 - 8}{\sqrt{1.36}}\right) = \Phi(0.858) = 0.805$$

(b) Let $T_B$ be the total travel time for truck B. The event that truck A will arrive at the
destination earlier than truck B is $(T_A < T_B)$ or $(T_A - T_B < 0)$. It can be shown

: that T is normal with ME : 


ESI 


. Gaussian variates with 




TOR 


, Hence the required. probability is. 


P(Z <0) 


EXAMPLE 4.8 


In considering the safety of a building, the total force acting on the columns of the
building must be examined. This would include the effects of the dead load D (due to
the weight of the structure), the live load L (due to human occupancy, movable
furniture, and the like), and the wind load W.

Assume that the load effects on the individual columns are statistically independent
Gaussian variates.

(a) Determine the mean and standard deviation of the total load acting on a
column.

(b) If the strength of a column is also Gaussian with a mean equal to 1.5 times the
total mean force, what is the probability of failure of the column? Assume that the
coefficient of variation of the strength is 15% and that the strength and load effects
are statistically independent.
Solution
(a) The combined load $S$ is

$$S = D + L + W$$

which is also Gaussian with


: Ag = Ap + wy + Hy =4.2 4 65 4 
and + : d Mr 


kip | 


i 


han the applied 


" —336kip |. 
ati nr BI. i eb 0 fuel t 
Hence the probability of failure is


CUTE = CEB) 


= 1 - 0.982 = 0.018


Figure E4.9 Construction activity network


EXAMPLE 4.9 


The framing of a house may be done by subassembling the components in a plant
and then delivering them to the site for framing. While this subassembly of components
is being done, the preparation of the site, which includes the excavation
through construction of the foundation walls, can proceed at the same time. These
activities may be represented with the activity network shown in Fig. E4.9 and
described in Table E4.9.


Table E4.9. Data of Example 4.9

                                                       Completion time (days)
Activity   Description                                  Mean      Std. dev.
1-2        Excavation                                   2         1
2-3        Construction of footings                     1         1/2
3-5        Construction of foundation walls             3         1
1-4        Precutting and subassembly of components     5         1
4-5        Delivery of components to site               2         1/2


Assume that the completion time of each activity is a Gaussian random variable,
with the respective means and standard deviations given in Table E4.9. Clearly,
framing of the house cannot start until the foundation walls are completed and the
components are delivered to the site. What is the probability that this will be at least
8 days after work started on the job? Completion times among the different activities
may be assumed to be statistically independent.


Solution
Denote the durations of the activities listed above as $X_1$, $X_2$, $X_3$, $X_4$, and $X_5$,
respectively. Let $T_1$ be the total time required to excavate and construct the footings
and foundation walls, and $T_2$ be the corresponding time for assembly and delivery of
the subcomponents. Then

$$T_1 = X_1 + X_2 + X_3$$
$$T_2 = X_4 + X_5$$

The required probability is

$$p = P(T_1 \ge 8 \cup T_2 \ge 8) = P(T_1 \ge 8) + P(T_2 \ge 8) - P(T_1 \ge 8)\,P(T_2 \ge 8)$$

According to Eqs. 4.17 and 4.18,

$$\mu_{T_1} = 2 + 1 + 3 = 6 \text{ days}$$
$$\sigma_{T_1} = \sqrt{1^2 + (\tfrac{1}{2})^2 + 1^2} = 1.5 \text{ days}$$
$$\mu_{T_2} = 5 + 2 = 7 \text{ days}$$
$$\sigma_{T_2} = \sqrt{1^2 + (\tfrac{1}{2})^2} = 1.11 \text{ days}$$

Since $T_1$ and $T_2$ are also Gaussian, we have

$$P(T_1 \ge 8) = 1 - \Phi\left(\frac{8 - 6}{1.5}\right) = 1 - \Phi(1.33) = 0.0918$$

$$P(T_2 \ge 8) = 1 - \Phi\left(\frac{8 - 7}{1.11}\right) = 1 - \Phi(0.90) = 0.1841$$

Hence, the required probability is

$$p = 0.0918 + 0.1841 - (0.0918 \times 0.1841) = 0.26$$

Alternatively, the probability may be calculated by observing that

$$p = P(T_1 \ge 8 \cup T_2 \ge 8) = 1 - P(T_1 < 8)\,P(T_2 < 8) = 1 - (0.9082)(0.8159) = 0.26$$


Products and quotients of random variables. For the product of
two random variables, say $Z = XY$, we have

$$x = \frac{z}{y}, \qquad \frac{\partial x}{\partial z} = \frac{1}{y}$$

Then Eq. 4.15 yields

$$f_Z(z) = \int_{-\infty}^{\infty} \left|\frac{1}{y}\right| f_{X,Y}\left(\frac{z}{y},\, y\right) dy \tag{4.19}$$

Similarly, for the quotient of two random variables, for example
$Z = X/Y$, the density function of $Z$ would be

$$f_Z(z) = \int_{-\infty}^{\infty} |y|\,f_{X,Y}(zy,\, y)\,dy \tag{4.20}$$

In this regard, we observe that by virtue of the result for sums (and
differences) of normal variates, it follows that the product (and quotient)
of statistically independent log-normal variates is also log-normal. For example, if

$$Z = \prod_{i=1}^{n} X_i$$

where the $X_i$'s are statistically independent log-normal random variables
with respective parameters $\lambda_{X_i}$ and $\zeta_{X_i}$, then

$$\ln Z = \sum_{i=1}^{n} \ln X_i$$

Since each $\ln X_i$ is normal (see Example 4.2), it follows that $\ln Z$ is also
normal with mean and variance, according to Eqs. 4.17 and 4.18, as follows:

$$\lambda_Z = E(\ln Z) = \sum_{i=1}^{n} \lambda_{X_i}$$

$$\zeta_Z^2 = \mathrm{Var}(\ln Z) = \sum_{i=1}^{n} \zeta_{X_i}^2$$
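
The rule for products and quotients of log-normal variates can be sketched as follows. Each factor enters with an exponent of +1 (numerator) or -1 (denominator), so that $\ln Z$ is a linear function of the $\ln X_i$ and Eqs. 4.17 and 4.18 apply. The function name, the parameter values, and the simulation settings are assumptions for illustration only.

```python
import math
import random
import statistics


def lognormal_product_params(lams, zetas, exponents):
    """Parameters (lambda_Z, zeta_Z) of Z = prod(X_i ** e_i), with e_i = +1 or -1.

    ln Z = sum(e_i * ln X_i), so Eq. 4.17 gives the mean of ln Z and
    Eq. 4.18 gives its variance.
    """
    lam_z = sum(e * lam for e, lam in zip(exponents, lams))
    zeta_z = math.sqrt(sum((e * z) ** 2 for e, z in zip(exponents, zetas)))
    return lam_z, zeta_z


# Assumed parameters for Z = X1 * X2 / X3
lams = [0.2, 1.0, -0.5]
zetas = [0.10, 0.20, 0.15]
exponents = [1, 1, -1]
lam_z, zeta_z = lognormal_product_params(lams, zetas, exponents)

# Simulation check with a fixed seed
rng = random.Random(7)
log_z = []
for _ in range(50_000):
    x1, x2, x3 = (rng.lognormvariate(l, z) for l, z in zip(lams, zetas))
    log_z.append(math.log(x1 * x2 / x3))

print(lam_z, statistics.fmean(log_z))
print(zeta_z, statistics.pstdev(log_z))
```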


EXAMPLE 4.10

The settlement of a footing on sand may be estimated on the basis of the theory of
elasticity as follows:

$$S = \frac{PBI}{M}$$

where
S = footing settlement, in feet
P = average applied bearing pressure, in tons per square foot (tsf)
B = smallest footing dimension, in feet
I = influence factor dependent on footing geometry, depth of embedment,
and depth to hard stratum
M = modulus of compressibility

Assume that P, B, I, and M are independent log-normal variates with parameters $\lambda_P$,
$\lambda_B$, $\lambda_I$, $\lambda_M$ and $\zeta_P$, $\zeta_B$, $\zeta_I$, $\zeta_M$, respectively. The following values are given for the
design of a particular footing.


s -Coefficient of 
| io Gyariation >° 


LTRS ar sa yee, 




zT ln 


Prey gr 


(a) Determine the mean settlement of the footing and its coefficient of variation.

(b) If the maximum allowable settlement is 2.5 in., what is the reliability against
excessive settlement, that is, the probability of no excessive settlement?

(c) If the variability in M can be decreased by investing for better information,
say reducing the coefficient of variation to 5% at an expense of $100, would you
spend this money? . Assume that, the exceedance of the maximum allowable settle- 


Solution 


[ 


2 (0.1476 HOA (syn 

= 0.01 + 0.01 + 0,0225 = 0: 
Ap = In (1.0) — 4(0.1)2 = 0 — 0.005 = —0.005 
dy = n6 = 1.792 


Hence 


The:mean:settlement;: -therefores:i yis E 


Hg = exp Üs E Pa 


7 (b) Reliability &P(S & Z3in) ^ 7200. 
= - o (pesa e 29) 


0.20654 ~ 


. $); We. have to; first, determi 
case, 


T = 3.466 — S 3.65 
and i 
$$\lambda_S = -2.194$$
$$\zeta_S^2 = 0.01 + 0.01 + 0.0025 = 0.0225$$
$$\zeta_S = 0.15$$

Hence the reliability is

$$\Phi\left(\frac{-1.572 + 2.194}{0.15}\right) = \Phi(4.15) = 0.99998$$


Assume that the criterion for decision is based on minimizing the expected cost. The
expected cost of the first design is

$$E(C_1) = C_0 + (1 - \text{reliability})(\text{cost of failure}) = C_0 + (1 - 0.9986)\,50{,}000 = C_0 + 70$$

where $C_0$ is the fixed initial cost of construction. Similarly, the expected cost of the
second design is

$$E(C_2) = C_0 + 100 + (1 - 0.99998)\,50{,}000 = C_0 + 101$$

Therefore, on the basis of the expected costs, the decision would be that money
should not be spent to gather more information on M.
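
The expected-cost comparison above is simple enough to restate as a short calculation. In the sketch below the function name is an assumption; the reliabilities (0.9986 and 0.99998), the $100 expense, and the $50,000 failure cost are the figures used in the solution. Costs are measured in excess of the fixed initial cost $C_0$, which is common to both alternatives.

```python
def expected_cost(extra_cost, reliability, failure_cost):
    """Expected cost in excess of the fixed initial construction cost C0."""
    return extra_cost + (1.0 - reliability) * failure_cost


failure_cost = 50_000.0

# First design: no additional information on M
cost_without_info = expected_cost(0.0, 0.9986, failure_cost)

# Second design: $100 spent to reduce the COV of M to 5%
cost_with_info = expected_cost(100.0, 0.99998, failure_cost)

print(cost_without_info, cost_with_info)
spend_the_money = cost_with_info < cost_without_info
```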


EXAMPLE 4.11 


A 15-ft-long 4-by-12-in. prismatic cantilever wood beam is carrying a uniformly
distributed load $w$ (see Fig. E4.11), with a mean load intensity of $\bar{w} = 180$ lb/ft and
a COV of $\delta_w = 15\%$. The material is structural-grade California redwood with a
rated average yield strength (parallel to grain under bending) of $\bar{s}_y = 4000$ psi and
COV $\delta_{s_y} = 20\%$.

Prescribe log-normal distributions for $w$ and $s_y$.

(a) Determine the probability that the maximum extreme fiber stress in the
beam will exceed the tensile yield strength of the wood.

The bending moment at the support of the beam is

$$M = \frac{wL^2}{2}$$

Since $w$ is a log-normal variate, $M$ will also be log-normal (see Example 4.2).
For a rectangular cross-section, the extreme fiber stress at any section of the beam
may be given by

$$s = \frac{6M}{bh^2}$$

where $b$ and $h$ are the width and depth, respectively, of the rectangular beam. It
follows, therefore, that $s$ is also a log-normal variate.
In the present case, the maximum bending moment occurs at the support, with a

Figure E4.11


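mean value of

$$\bar{M} = \tfrac{1}{2} \times 180 \times 15^2 \times 12 = 243{,}000 \text{ in.-lb}$$

Hence the mean maximum extreme fiber stress in the beam is

$$\bar{s} = \frac{6 \times 243{,}000}{4 \times 12^2} = 2531 \text{ psi}$$

and COV

$$\delta_s = 15\%$$

The required probability is

$$p_F = P(s > s_y) = P\left(\frac{s}{s_y} > 1.0\right) = P\left(\ln\frac{s}{s_y} > 0\right)$$

and since $s$ and $s_y$ are log-normal variates, $\ln(s/s_y)$ is Gaussian with mean

$$\lambda = \left(\ln\bar{s} - \tfrac{1}{2}\delta_s^2\right) - \left(\ln\bar{s}_y - \tfrac{1}{2}\delta_{s_y}^2\right) = \ln\frac{2531}{4000} - \tfrac{1}{2}\left[(0.15)^2 - (0.20)^2\right] = -0.45$$

and standard deviation

$$\zeta = \sqrt{\delta_s^2 + \delta_{s_y}^2} = \sqrt{(0.15)^2 + (0.20)^2} = 0.25$$

Hence,

$$p_F = 1 - \Phi\left(\frac{0 + 0.45}{0.25}\right) = 1 - \Phi(1.80) = 0.036$$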
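
The arithmetic of part (a) can be retraced with a few lines of code. The sketch below uses the data of the example (15-ft span, 4-by-12-in. section, mean load 180 lb/ft with 15% COV, mean yield strength 4000 psi with 20% COV) and the same small-COV approximation $\zeta \approx \delta$ that the solution uses. The variable names are assumptions.

```python
import math
from statistics import NormalDist

# Data of Example 4.11
mean_w, cov_w = 180.0, 0.15      # load intensity, lb/ft
length_ft = 15.0                 # cantilever span, ft
b, h = 4.0, 12.0                 # section width and depth, in.
mean_sy, cov_sy = 4000.0, 0.20   # yield strength, psi

# Mean maximum moment at the support, converted to in.-lb
mean_moment = 0.5 * mean_w * length_ft ** 2 * 12.0
# Mean extreme fiber stress, psi
mean_stress = 6.0 * mean_moment / (b * h ** 2)

# ln(s/s_y) is Gaussian; zeta is approximated by the COV for small COV values
lam = (math.log(mean_stress) - 0.5 * cov_w ** 2) - (math.log(mean_sy) - 0.5 * cov_sy ** 2)
zeta = math.sqrt(cov_w ** 2 + cov_sy ** 2)

p_fail = 1.0 - NormalDist().cdf((0.0 - lam) / zeta)
print(mean_moment, mean_stress, lam, zeta, p_fail)
```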


(b) Suppose that in order to ensure an adequate level of safety, the probability of
overstressing the wood beyond its yield strength is not permitted to exceed 0.001.
Redesign the beam section, keeping the beam width the same at 4 in. (that is,
determine $h$).

The limiting condition is

$$P\left(\ln\frac{s}{s_y} > 0\right) = 0.001$$

or

$$P\left(\ln\frac{6M}{bh^2 s_y} > 0\right) = 0.001$$

But $\ln\dfrac{6M}{bh^2 s_y}$ is also Gaussian with
_ , 6(243,000) ` 1 
2 =in F000 - 3 (O15 — (0.20)] 
* 1. 
=In "x — 0.01 
and 
£c2025 




Therefore the required limiting condition becomes 
Qi 2) 4-.0.017 ow 
gh O: In (91613 /A) + ae 270601 


Har 


= 91.13 exp (0.25 x 3.09 — 0.01) cet 
= 195.35 rp 2 ot 
Thus : Ta dade emp hi 
98 in. i 


: " T i 
(c) Determination of allowable design stress. We observe that the beam size may be
determined (or designed) using a mean allowable stress $\bar{s}_a$, as follows.

If we limit the maximum load-induced stress to a specified allowable stress $\bar{s}_a$, we
should have


Clearly the beam would be of larger cross-section and thus safer if we use lower
values of $\bar{s}_a$. However, if $\bar{s}_a$ is too low, the design would be unnecessarily conservative
and thus wasteful. In order to ensure an adequate level of safety without being
overly conservative, the allowable stress $\bar{s}_a$ may be determined on the basis of a
specified tolerable probability $p_F$. For this purpose, we observe that, for small $\delta_s$
and $\delta_{s_y}$,


and: thus 5 è 


from which 


The ratio 2,/2, is called the mean “safety factor”; 
mean allowable stress 


a 




where 
Applying this 

Thus — . 
m ` Sa = w = wat 


dca 


GM 604300) i. 


ba, 4(1852) 


h = 14.03 in. 


(this value sn ctly the same as that of par ecause of the approximation 
introduced above for D : i 


The central limit theorem. One of the most significant theorems in
probability theory is that pertaining to the limiting distribution of a sum of
random variables, known as the central limit theorem. Stated loosely, the
theorem says that the sum of a large number of individual random components,
none of which is dominant, tends to the normal distribution as
the number of components (regardless of their initial distributions) increases
without limit. Therefore, if a physical process is the result of the
totality of a large number of individual effects, then according to the
central limit theorem the process would tend to be Gaussian; that is, the
sum of the individual effects would tend to have a Gaussian distribution.

The proof of the central limit theorem is beyond our scope of interest;
however, the essence of the proof may be demonstrated with the following
example. Suppose that


$$S = \frac{1}{\sqrt{n}}\sum_{i=1}^{n} X_i$$

where the $X_i$'s are statistically independent and identically distributed
with PMF

$$P(X_i = -1) = \tfrac{1}{2}$$
$$P(X_i = +1) = \tfrac{1}{2}$$

and $P(X_i = x) = 0$ otherwise. The factor $1/\sqrt{n}$ is necessary to retain a
finite variance for $S$ as $n \to \infty$.

Then the probability distribution of $S$, for increasing values of $n$ ($n = 2$,
5, 10, 20), would be as shown in Fig. 4.1 (with the probabilities at specified
values of $S$ spread over the appropriate intervals).
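
The demonstration can be reproduced without a plot. In the sketch below the exact PMF of $S$ is obtained from the binomial distribution (the number of $+1$ outcomes among $n$ trials), and the largest gap between the CDF of $S$ and the standard normal CDF is computed for $n = 2, 5, 10, 20$. The function names and the choice of gap measure are assumptions of this sketch.

```python
import math
from statistics import NormalDist


def pmf_of_s(n):
    """Exact PMF of S = (X_1 + ... + X_n)/sqrt(n), each X_i = -1 or +1 with probability 1/2.

    If k of the X_i equal +1, the sum is 2k - n, with probability C(n, k) / 2**n.
    """
    return [((2 * k - n) / math.sqrt(n), math.comb(n, k) / 2 ** n)
            for k in range(n + 1)]


def max_cdf_gap(n):
    """Largest |F_S(s) - Phi(s)| over the possible values s of S."""
    std_normal = NormalDist()
    cumulative, gap = 0.0, 0.0
    for value, prob in pmf_of_s(n):
        cumulative += prob
        gap = max(gap, abs(cumulative - std_normal.cdf(value)))
    return gap


sizes = (2, 5, 10, 20)
variances = {n: sum(p * s * s for s, p in pmf_of_s(n)) for n in sizes}
gaps = {n: max_cdf_gap(n) for n in sizes}

for n in sizes:
    print(n, variances[n], gaps[n])
```

The variance of $S$ stays at unity for every $n$, which is the role of the factor $1/\sqrt{n}$, while the gap to the normal CDF shrinks as $n$ grows.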

Figure 4.1 Demonstration of the central limit theorem. (Area of each rectangle is
probability of centered value)

By virtue of the central limit theorem, the product of a large number of
independent factors (none of which dominates the product) will tend to the
log-normal distribution. That is, regardless of the distributions of $X_i$, the
product

$$P = \prod_{i=1}^{n} X_i$$

will approach a log-normal distribution as $n \to \infty$.

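Generalization. The method described above for a function of two
variables can be generalized to derive the distribution of a function of $n$
random variables. Thus, if

$$Z = g(X_1, X_2, \ldots, X_n) \tag{4.21}$$

then, generalizing Eq. 4.14, we have

$$F_Z(z) = \int \cdots \int_{[g(x_1, \ldots, x_n) \le z]} f_{X_1, \ldots, X_n}(x_1, \ldots, x_n)\,dx_1 \cdots dx_n = \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty}\int_{-\infty}^{g^{-1}} f_{X_1, \ldots, X_n}(x_1, \ldots, x_n)\,dx_1 \cdots dx_n \tag{4.22}$$

where $g^{-1} = g^{-1}(z, x_2, \ldots, x_n)$. Changing the variable of integration from
$x_1$ to $z$, we have

$$F_Z(z) = \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty}\int_{-\infty}^{z} f_{X_1, \ldots, X_n}(g^{-1}, x_2, \ldots, x_n)\left|\frac{\partial g^{-1}}{\partial z}\right| dz\,dx_2 \cdots dx_n$$

Therefore

$$f_Z(z) = \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} f_{X_1, \ldots, X_n}(g^{-1}, x_2, \ldots, x_n)\left|\frac{\partial g^{-1}}{\partial z}\right| dx_2 \cdots dx_n \tag{4.23}$$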


4.3. MOMENTS OF FUNCTIONS OF RANDOM VARIABLES
4.3.1. Introduction


According to Section 4.2, the probability distribution of a function of
random variables can, theoretically, be derived from the probability
distributions of the basic random variables; however, such derivations are
generally difficult, especially when the function is nonlinear. In such
circumstances, the moments (particularly the mean and variance) of the
function may be the only practically obtainable information. In many
instances, this may be sufficient for practical purposes even if the correct
probability distribution must be left undetermined. Such moments are
functionally related to the moments of the individual basic variates, and
therefore may be derived as functions of the moments of the basic variates.


Mathematical expectation. The mathematical expectation of a function
of several random variables can be obtained as a generalization of Eq.
3.8; thus, for a function of $n$ variables, $Z = g(X_1, X_2, \ldots, X_n)$, its mathematical
expectation is

$$E(Z) = E[g(X_1, X_2, \ldots, X_n)] = \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} g(x_1, x_2, \ldots, x_n)\,f_{X_1, \ldots, X_n}(x_1, \ldots, x_n)\,dx_1\,dx_2 \cdots dx_n \tag{4.24}$$

In the following, we shall use Eq. 4.24 as the basis for deriving the (first
and second) moments of linear functions of random variables; the results
will be the basis for the first-order approximate moments of nonlinear
functions.


4.3.2. Mean and variance of a linear function 


Consider the moments of linear functions. First of all, suppose that

$$Y = aX + b$$

where $a$ and $b$ are constants. Then according to Eq. 3.8, the mean value
of $Y$ is the mathematical expectation of $aX + b$, or

$$E(Y) = E(aX + b) = \int_{-\infty}^{\infty} (ax + b)\,f_X(x)\,dx = a\int_{-\infty}^{\infty} x\,f_X(x)\,dx + b\int_{-\infty}^{\infty} f_X(x)\,dx = aE(X) + b \tag{4.25}$$

whereas the variance of $Y$ is

$$\mathrm{Var}(Y) = E[(Y - \mu_Y)^2] = E[(aX + b - a\mu_X - b)^2] = a^2\int_{-\infty}^{\infty} (x - \mu_X)^2 f_X(x)\,dx = a^2\,\mathrm{Var}(X) \tag{4.26}$$

Furthermore, if $Y = a_1 X_1 + a_2 X_2$, where $a_1$ and $a_2$ are constants, then
according to Eq. 4.24,

$$E(Y) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} (a_1 x_1 + a_2 x_2)\,f_{X_1, X_2}(x_1, x_2)\,dx_1\,dx_2 = a_1\int_{-\infty}^{\infty} x_1 f_{X_1}(x_1)\,dx_1 + a_2\int_{-\infty}^{\infty} x_2 f_{X_2}(x_2)\,dx_2$$

The integrals are, respectively, $E(X_1)$ and $E(X_2)$; hence

$$E(Y) = a_1 E(X_1) + a_2 E(X_2) \tag{4.27}$$

That is, the expected value of a sum is the sum of the expected values. The
corresponding variance is

$$\mathrm{Var}(Y) = E[(a_1 X_1 + a_2 X_2) - (a_1\mu_{X_1} + a_2\mu_{X_2})]^2 = E[a_1(X_1 - \mu_{X_1}) + a_2(X_2 - \mu_{X_2})]^2$$

$$\phantom{\mathrm{Var}(Y)} = E[a_1^2(X_1 - \mu_{X_1})^2 + 2a_1 a_2(X_1 - \mu_{X_1})(X_2 - \mu_{X_2}) + a_2^2(X_2 - \mu_{X_2})^2]$$

Recognizing that the expected values of the first and third terms are
variances, whereas that of the middle term is a covariance, Eq. 3.72, we
obtain

$$\mathrm{Var}(Y) = a_1^2\,\mathrm{Var}(X_1) + a_2^2\,\mathrm{Var}(X_2) + 2a_1 a_2\,\mathrm{Cov}(X_1, X_2) \tag{4.28}$$


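If $Y = a_1 X_1 - a_2 X_2$, the results would be

$$E(Y) = a_1 E(X_1) - a_2 E(X_2) \tag{4.29}$$

$$\mathrm{Var}(Y) = a_1^2\,\mathrm{Var}(X_1) + a_2^2\,\mathrm{Var}(X_2) - 2a_1 a_2\,\mathrm{Cov}(X_1, X_2) \tag{4.30}$$

If $X_1$ and $X_2$ are statistically independent, $\mathrm{Cov}(X_1, X_2) = 0$ and Eqs.
4.28 and 4.30 reduce to

$$\mathrm{Var}(Y) = a_1^2\,\mathrm{Var}(X_1) + a_2^2\,\mathrm{Var}(X_2) \tag{4.31}$$

In the more general case,

$$Y = \sum_{i=1}^{n} a_i X_i$$

where $a_i$ are constants, we have, on extending the results of Eqs. 4.27
through 4.30,

$$E(Y) = \sum_{i=1}^{n} a_i E(X_i) \tag{4.32}$$

$$\mathrm{Var}(Y) = \sum_{i=1}^{n} a_i^2\,\mathrm{Var}(X_i) + \sum_{i \ne j}\sum a_i a_j\,\mathrm{Cov}(X_i, X_j) \tag{4.33}$$

$$\phantom{\mathrm{Var}(Y)} = \sum_{i=1}^{n} a_i^2 \sigma_{X_i}^2 + \sum_{i \ne j}\sum a_i a_j \rho_{ij}\,\sigma_{X_i}\sigma_{X_j} \tag{4.33a}$$

where $\rho_{ij}$ is the correlation coefficient between $X_i$ and $X_j$. Moreover, if $Z$
is another linear function of the $X_i$'s, that is,

$$Z = \sum_{i=1}^{n} b_i X_i$$

then the covariance between $Y$ and $Z$ can be shown to be

$$\mathrm{Cov}(Y, Z) = \sum_{i=1}^{n} a_i b_i\,\mathrm{Var}(X_i) + \sum_{i \ne j}\sum a_i b_j\,\mathrm{Cov}(X_i, X_j) \tag{4.34}$$

$$\phantom{\mathrm{Cov}(Y, Z)} = \sum_{i=1}^{n} a_i b_i \sigma_{X_i}^2 + \sum_{i \ne j}\sum a_i b_j \rho_{ij}\,\sigma_{X_i}\sigma_{X_j}$$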
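
The moment formulas for linear functions lend themselves to a compact implementation. The sketch below is a minimal version, assuming the correlation coefficients are supplied as a full matrix with ones on the diagonal; the function names and the numerical values are illustrative only. Setting the second coefficient vector equal to the first recovers the variance of Eq. 4.33a from the covariance of Eq. 4.34.

```python
def linear_mean(a, means):
    """Eq. 4.32: E(Y) = sum of a_i * E(X_i)."""
    return sum(ai * mi for ai, mi in zip(a, means))


def linear_cov(a, b, sigmas, rho):
    """Eq. 4.34: Cov(Y, Z) for Y = sum(a_i X_i) and Z = sum(b_i X_i).

    rho[i][j] is the correlation coefficient between X_i and X_j, with
    rho[i][i] = 1, so the i == j terms supply a_i * b_i * Var(X_i).
    """
    n = len(a)
    return sum(a[i] * b[j] * rho[i][j] * sigmas[i] * sigmas[j]
               for i in range(n) for j in range(n))


def linear_var(a, sigmas, rho):
    """Eq. 4.33a: Var(Y) is the covariance of Y with itself."""
    return linear_cov(a, a, sigmas, rho)


# Assumed values for two correlated variables
means = [10.0, 4.0]
sigmas = [3.0, 4.0]
rho = [[1.0, 0.5],
       [0.5, 1.0]]

mean_sum = linear_mean([1.0, 1.0], means)                     # E(X1 + X2)
var_sum = linear_var([1.0, 1.0], sigmas, rho)                 # Eq. 4.28
var_diff = linear_var([1.0, -1.0], sigmas, rho)               # Eq. 4.30
cov_sum_diff = linear_cov([1.0, 1.0], [1.0, -1.0], sigmas, rho)
print(mean_sum, var_sum, var_diff, cov_sum_diff)
```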
EXAMPLE 4.12

The lengths of two rods will be determined by two measurements with an unbiased instrument that makes random error with mean 0 and standard deviation $\sigma$ in each measurement. Compute the variance in the estimation of the lengths $T_1$ and $T_2$ by the following methods:
(a) The two rods are measured separately.
(b) The sum and difference of the lengths of the two rods are measured instead of the individual lengths.

(a) Let $M_1$ and $M_2$ denote the measurements obtained for the two rods; then

$$T_1 = M_1 + \varepsilon_1$$

and

$$T_2 = M_2 + \varepsilon_2$$

where $\varepsilon_1$ and $\varepsilon_2$ are the errors involved in the measurements. Then the variance in the estimation of $T_1$ is

$$\mathrm{Var}(T_1) = \mathrm{Var}(M_1 + \varepsilon_1) = \mathrm{Var}(M_1) + \mathrm{Var}(\varepsilon_1) = 0 + \sigma^2 = \sigma^2$$

Similarly, $\mathrm{Var}(T_2) = \mathrm{Var}(M_2 + \varepsilon_2) = \sigma^2$.

(b) Let $M_3$ denote the measured combined length of the two rods, and $M_4$ denote the measured difference between the lengths of the two rods; then

$$T_1 + T_2 = M_3 + \varepsilon_3$$

and

$$T_1 - T_2 = M_4 + \varepsilon_4$$

Solving these two equations simultaneously, we have

$$T_1 = \frac{M_3 + M_4}{2} + \frac{\varepsilon_3 + \varepsilon_4}{2}$$

and

$$T_2 = \frac{M_3 - M_4}{2} + \frac{\varepsilon_3 - \varepsilon_4}{2}$$

Assuming that the errors $\varepsilon_3$ and $\varepsilon_4$ are statistically independent, the variance in the estimation of $T_1$ is, therefore,

$$\mathrm{Var}(T_1) = \mathrm{Var}\left(\frac{\varepsilon_3 + \varepsilon_4}{2}\right) = \mathrm{Var}\left(\frac{\varepsilon_3}{2}\right) + \mathrm{Var}\left(\frac{\varepsilon_4}{2}\right) = \frac{1}{4}[\mathrm{Var}(\varepsilon_3) + \mathrm{Var}(\varepsilon_4)] = \frac{1}{4} \times 2\sigma^2 = \frac{\sigma^2}{2}$$

Similarly,

$$\mathrm{Var}(T_2) = \mathrm{Var}\left(\frac{\varepsilon_3}{2}\right) + \mathrm{Var}\left(\frac{\varepsilon_4}{2}\right) = \frac{1}{4}[\mathrm{Var}(\varepsilon_3) + \mathrm{Var}(\varepsilon_4)] = \frac{\sigma^2}{2}$$

Therefore we see that the second method of measuring the lengths of the two rods is better, since the variances in the estimation of the true lengths $T_1$ and $T_2$ are smaller.
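The comparison of the two measuring methods can be illustrated by simulation. The sketch below is not part of the example; it assumes normally distributed errors (the example specifies only a zero mean and a standard deviation), and the rod lengths, trial count, and seed are hypothetical.

```python
import random


def sample_variance(values):
    """Unbiased sample variance (divisor n - 1)."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / (n - 1)


def simulate(sigma=1.0, t1=10.0, t2=6.0, trials=100_000, seed=1):
    """Return the simulated variances of the rod-1 estimate for methods (a) and (b)."""
    rng = random.Random(seed)
    direct, indirect = [], []
    for _ in range(trials):
        # Method (a): rod 1 is measured directly, with one error.
        direct.append(t1 + rng.gauss(0.0, sigma))
        # Method (b): the sum and the difference are measured, each with its
        # own independent error, and the two equations are solved for rod 1.
        m_sum = (t1 + t2) + rng.gauss(0.0, sigma)
        m_diff = (t1 - t2) + rng.gauss(0.0, sigma)
        indirect.append(0.5 * (m_sum + m_diff))
    return sample_variance(direct), sample_variance(indirect)


var_a, var_b = simulate()
print(var_a, var_b)  # expected to be close to sigma**2 and sigma**2 / 2
```

The simulated variance for method (b) should be roughly half that of method (a), as the analysis predicts.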


EXAMPLE 4.13

The total vertical load on the ground-floor columns of an n-story building would be the sum of the individual contributions from each of the n floors; thus

$$Y = \sum_{i=1}^{n} X_i$$

where $X_i$ is the load on the column from the ith floor. Assuming the mean and variance to be the same for all floors (this appears to be the case from actual load surveys [Mitchell and Woodgate, 1970]), the mean load on the column is

$$\mu_Y = n\mu_X$$

and the variance, from Eq. 4.33a, is

$$\mathrm{Var}(Y) = n\,\mathrm{Var}(X) + \mathrm{Var}(X)\sum_{i \neq j}\sum \rho_{ij}$$

where $\rho_{ij}$ is the correlation between the loads on the ith and jth floors.

(a) If the loads on any two floors are assumed to be statistically independent, that is, $\rho_{ij} = 0$, then $\mathrm{Var}(Y) = n\,\mathrm{Var}(X)$, and the standard deviation would be

$$\sigma_Y = \sqrt{n}\,\sigma_X$$

and the COV is

$$\delta_Y = \frac{\sqrt{n}\,\sigma_X}{n\mu_X} = \frac{\delta_X}{\sqrt{n}}$$

The design load is usually specified to be on the high side; suppose that this is taken at k standard deviations above the mean. The design load for n floors then is

$$Y^* = \mu_Y + k\sigma_Y = \mu_Y(1 + k\delta_Y) = n\left(1 + \frac{k\delta_X}{\sqrt{n}}\right)\mu_X$$

This means that the total design load on the ground-floor column increases with the number of stories; however, on the average, the load from each floor is

$$\frac{Y^*}{n} = \mu_X\left(1 + \frac{k\delta_X}{\sqrt{n}}\right)$$

which means that the contribution from the individual floors decreases with the number of floors in the building.

The reduction factor that specifies the load contribution from each floor, which can be defined as $r = Y^*/nX^*$, therefore, becomes

$$r = \frac{n\mu_X\left(1 + k\delta_X/\sqrt{n}\right)}{n(\mu_X + k\sigma_X)} = \frac{1 + k\delta_X/\sqrt{n}}{1 + k\delta_X}$$

(b) However, if the correlation between any two floors is the same and positive (that is, $\rho_{ij} = \rho$, a positive constant for any i and j), then the variance of Y becomes

$$\mathrm{Var}(Y) = n\,\mathrm{Var}(X) + \mathrm{Var}(X)[n(n - 1)\rho] = \mathrm{Var}(X)[n + n(n - 1)\rho]$$

and

$$\delta_Y = \delta_X\sqrt{\frac{1 + (n - 1)\rho}{n}}$$

In this case, the reduction factor would be

$$r = \frac{1 + k\delta_Y}{1 + k\delta_X} = \frac{1 + k\delta_X\sqrt{[1 + (n - 1)\rho]/n}}{1 + k\delta_X}$$

which is greater than the corresponding factor obtained earlier assuming statistical independence. Any correlation between the loads on different floors may be expected to be positive; on this basis, therefore, the assumption of statistical independence would yield results on the unsafe side.
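The effect of the number of floors and of the correlation on the reduction factor of Example 4.13 can be tabulated with a short function. This is a sketch under the example's assumptions (equal mean and COV on all floors, one common correlation coefficient); the values k = 2 and a floor-load COV of 0.3 are hypothetical.

```python
import math


def reduction_factor(n, k, cov_x, rho=0.0):
    """Reduction factor r = Y*/(n X*) for n floors.

    cov_x is the COV of a single floor load, k the number of standard
    deviations above the mean used for the design load, and rho the common
    correlation coefficient between floor loads (rho = 0 gives part (a)).
    """
    cov_y = cov_x * math.sqrt((1.0 + (n - 1) * rho) / n)
    return (1.0 + k * cov_y) / (1.0 + k * cov_x)


for n in (1, 5, 10, 20):
    independent = reduction_factor(n, k=2.0, cov_x=0.3)
    correlated = reduction_factor(n, k=2.0, cov_x=0.3, rho=0.5)
    print(n, round(independent, 3), round(correlated, 3))
```

For every n greater than 1 the correlated case gives the larger factor, which is the sense in which assuming independence errs on the unsafe side.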


If n random variables $X_1, X_2, \ldots, X_n$ are statistically independent, the mean value of their product

$$Z = X_1X_2 \cdots X_n$$

is

$$E(Z) = E(X_1)E(X_2) \cdots E(X_n)$$

Similarly, we can show that

$$E(Z^2) = E(X_1^2)E(X_2^2) \cdots E(X_n^2)$$

Therefore

$$\mathrm{Var}(Z) = E(X_1^2)E(X_2^2) \cdots E(X_n^2) - \left(\mu_{X_1}\mu_{X_2} \cdots \mu_{X_n}\right)^2$$

For a general function of a random variable X, that is,

$$Y = g(X)$$

the exact moments of Y may be obtained as the mathematical expectation of g(X); according to Eq. 3.8, the mean and variance would be

$$E(Y) = \int_{-\infty}^{\infty} g(x)\, f_X(x)\, dx$$

$$\mathrm{Var}(Y) = \int_{-\infty}^{\infty} [g(x) - E(Y)]^2 f_X(x)\, dx$$


Obviously, to obtain the mean and variance of the function Y with the above relations, information on $f_X(x)$ is needed. In many applications, however, the density function $f_X(x)$ may not be known; information may be limited to the mean and variance of the original variate X. Furthermore, even when $f_X(x)$ is known, the integrations indicated above may be difficult to perform. For these reasons, approximate mean and variance of the function Y would be practically useful and may be obtained as follows.

Expand g(X) in a Taylor series about the mean value $\mu_X$; thus

$$Y = g(\mu_X) + (X - \mu_X)\frac{dg}{dX} + \frac{1}{2}(X - \mu_X)^2\frac{d^2g}{dX^2} + \cdots \tag{4.37}$$

where the derivatives are evaluated at $\mu_X$.
If the series is truncated at the linear terms, we obtain the first-order approximate mean and variance of Y:

$$E(Y) \simeq g(\mu_X) \tag{4.38}$$

and

$$\mathrm{Var}(Y) \simeq \mathrm{Var}(X)\left(\frac{dg}{dX}\right)^2 \tag{4.39}$$

We can observe that if the function g(X) is approximately linear for the entire range of values of X, Eqs. 4.38 and 4.39 should yield good approximations of the exact moments (Hald, 1952). Moreover, when the variance of X is small relative to $g(\mu_X)$, the above approximations should be adequate for many practical purposes even when the function is nonlinear.

The above first-order approximations may be successively improved by including the higher-order terms in the Taylor series; for example, if the second-order term in Eq. 4.37 is included, the second-order approximations are accordingly

$$E(Y) \simeq g(\mu_X) + \frac{1}{2}\mathrm{Var}(X)\frac{d^2g}{dX^2} \tag{4.40}$$

and

$$\mathrm{Var}(Y) \simeq \mathrm{Var}(X)\left(\frac{dg}{dX}\right)^2 - \frac{1}{4}\mathrm{Var}^2(X)\left(\frac{d^2g}{dX^2}\right)^2 + E(X - \mu_X)^3\,\frac{dg}{dX}\frac{d^2g}{dX^2} + \frac{1}{4}E(X - \mu_X)^4\left(\frac{d^2g}{dX^2}\right)^2 \tag{4.41}$$

Such improvements, of course, would involve the higher moments of the original variate in the evaluation of Var(Y), as shown in Eq. 4.41, in which the third and fourth central moments of X are involved.

For practical purposes, of course, we may use Eq. 4.40 for E(Y) and Eq. 4.39 for Var(Y); in this way, we can take advantage of an improved mean value for Y, without involving more than the mean and variance of X.
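The practical combination suggested above (Eq. 4.40 for the mean, Eq. 4.39 for the variance) can be sketched as follows. The function name is hypothetical, the derivatives are assumed to be supplied analytically by the caller, and the test function $Y = X^2$ with mean 3 and standard deviation 0.5 is chosen only because its exact mean, $\mu_X^2 + \sigma_X^2$, is known for any distribution.

```python
def taylor_moments(g, dg, d2g, mean, var):
    """Approximate moments of Y = g(X) from the mean and variance of X.

    g, dg and d2g are the function and its first and second derivatives.
    Returns (first-order mean, first-order variance, second-order mean).
    """
    first_mean = g(mean)                            # Eq. 4.38
    first_var = var * dg(mean) ** 2                 # Eq. 4.39
    second_mean = g(mean) + 0.5 * var * d2g(mean)   # Eq. 4.40
    return first_mean, first_var, second_mean


# Illustration with Y = X**2, mean 3.0 and variance 0.25 (hypothetical values).
m1, v1, m2 = taylor_moments(
    g=lambda x: x ** 2,
    dg=lambda x: 2.0 * x,
    d2g=lambda x: 2.0,
    mean=3.0,
    var=0.25,
)
print(m1, v1, m2)
```

Here the second-order mean reproduces the exact value because the function is quadratic, whereas the first-order mean omits the variance contribution.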


EXAMPLE 4.14

The maximum impact pressures of ocean waves on coastal structures may be determined by

$$p_{\max} = 2.7\,\frac{\rho K U^2}{D}$$

where $\rho$ = density of water; K = length of hypothetical piston; D = thickness of air cushion; and U = horizontal velocity of the advancing wave.

Suppose that the mean crest velocity is 4.5 ft/sec with a COV of 20%. The density of sea water is about 1.96 slugs/cu ft, and the ratio K/D = 35. Determine the mean and standard deviation of the peak impact pressure.

According to Eqs. 4.38 and 4.39, we obtain

$$E(p_{\max}) \simeq 2.7(1.96)(35)(4.5)^2 = 3750.70 \text{ psf} = 26.05 \text{ psi}$$

and

$$\mathrm{Var}(p_{\max}) \simeq \left(2.7\,\rho\,\frac{K}{D}\,2\mu_U\right)^2 \mathrm{Var}(U) = (2.7 \times 1.96 \times 35 \times 2 \times 4.5)^2(0.20 \times 4.5)^2$$

Thus

$$\sigma_{p_{\max}} = 2.7 \times 1.96 \times 35 \times 2 \times 0.2 \times (4.5)^2 = 1500.3 \text{ psf} = 10.42 \text{ psi}$$

Thus the COV of the maximum wave pressure is 0.40, which is twice that of the wave velocity.
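The arithmetic of Example 4.14 can be reproduced in a few lines. This is a sketch using the example's data; the variable names are hypothetical, and the second-order mean from Eq. 4.40 is added here only for comparison (it is not computed in the example).

```python
rho = 1.96        # density of sea water, slugs/cu ft
k_over_d = 35.0   # ratio K/D
mean_u = 4.5      # mean crest velocity, ft/sec
cov_u = 0.20      # COV of the velocity

coeff = 2.7 * rho * k_over_d   # p_max = coeff * U**2
sd_u = cov_u * mean_u

mean_p = coeff * mean_u ** 2                # first-order mean (Eq. 4.38), psf
sd_p = coeff * 2.0 * mean_u * sd_u          # first-order std dev (Eq. 4.39), psf
cov_p = sd_p / mean_p
second_mean_p = mean_p + 0.5 * sd_u ** 2 * (2.0 * coeff)   # Eq. 4.40, psf

print(round(mean_p, 1), round(mean_p / 144.0, 2))   # psf and psi
print(round(sd_p, 1), round(sd_p / 144.0, 2))       # psf and psi
print(round(cov_p, 2), round(second_mean_p, 1))
```

Because the pressure is quadratic in U, the second-order mean equals the exact mean for any distribution of U with this mean and variance; it is about 4% higher than the first-order value.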


If Y is a function of several random variables, that is,

$$Y = g(X_1, X_2, \ldots, X_n)$$

we obtain the approximate mean and variance of Y similarly as follows. Expand the function $g(X_1, X_2, \ldots, X_n)$ in a Taylor series about the mean values $\mu_{X_1}, \mu_{X_2}, \ldots, \mu_{X_n}$; in this case, we have

$$Y = g(\mu_{X_1}, \mu_{X_2}, \ldots, \mu_{X_n}) + \sum_{i=1}^{n}(X_i - \mu_{X_i})\frac{\partial g}{\partial X_i} + \frac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{n}(X_i - \mu_{X_i})(X_j - \mu_{X_j})\frac{\partial^2 g}{\partial X_i\,\partial X_j} + \cdots \tag{4.42}$$

where the derivatives are evaluated at $\mu_{X_1}, \mu_{X_2}, \ldots, \mu_{X_n}$.


Truncating the series at the linear terms, and by virtue of Eqs. 4.32 and 4.33, we obtain the first-order approximate mean and variance of Y as follows:

$$E(Y) \simeq g(\mu_{X_1}, \mu_{X_2}, \ldots, \mu_{X_n}) \tag{4.43}$$

which says that the mean of the function is equal (approximately) to the function of the means; and

$$\mathrm{Var}(Y) \simeq \sum_{i=1}^{n} c_i^2\,\mathrm{Var}(X_i) + \sum_{i \neq j}\sum c_ic_j\,\mathrm{Cov}(X_i, X_j) \tag{4.44}$$

where $c_i$ and $c_j$ are the values of the partial derivatives $\partial g/\partial X_i$ and $\partial g/\partial X_j$, respectively, evaluated at $\mu_{X_1}, \mu_{X_2}, \ldots, \mu_{X_n}$. Observe that if $X_i$ and $X_j$ are uncorrelated (or statistically independent) for all i and j, then Eq. 4.44 reduces to

$$\mathrm{Var}(Y) \simeq \sum_{i=1}^{n} c_i^2\,\mathrm{Var}(X_i) \tag{4.44a}$$


Again, the conditions for the applicability of the above approximations are the same as those stated earlier for the single-variable case. The above approximate mean and variance may also be improved by including the higher-order terms of the Taylor series expansion of $g(X_1, X_2, \ldots, X_n)$. In particular, from Eq. 4.42, the second-order approximate mean of Y would be

$$E(Y) \simeq g(\mu_{X_1}, \mu_{X_2}, \ldots, \mu_{X_n}) + \frac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{n}\left(\frac{\partial^2 g}{\partial X_i\,\partial X_j}\right)\mathrm{Cov}(X_i, X_j) \tag{4.43a}$$

where the derivatives are evaluated at $\mu_{X_1}, \mu_{X_2}, \ldots, \mu_{X_n}$. Again, if $X_i$ and $X_j$ are uncorrelated, Eq. 4.43a becomes

$$E(Y) \simeq g(\mu_{X_1}, \mu_{X_2}, \ldots, \mu_{X_n}) + \frac{1}{2}\sum_{i=1}^{n}\left(\frac{\partial^2 g}{\partial X_i^2}\right)\mathrm{Var}(X_i) \tag{4.43b}$$


Equation 4.44 is the basis of error propagation analysis in measurement theory (Jordan, Eggert, and Kneissel, 1961; Richardus, 1966). However, it is also a useful approximation for many other engineering problems (Cornell, 1969; Ang, 1973); in particular, it is the basis for the general analysis of uncertainty as presented in Vol. II.


Figure E4.15

EXAMPLE 4.15

Consider a 5-meter-high column supporting a load S, which is inclined at an angle $\theta$ from the vertical as shown in Fig. E4.15. Here S and $\theta$ are random variables with respective means and standard deviations

$$\bar{S} = 100 \text{ newtons}, \qquad \sigma_S = 20 \text{ newtons}$$

$$\bar{\theta} = 30^\circ\ (0.524 \text{ rad}), \qquad \sigma_\theta = 5^\circ\ (0.087 \text{ rad})$$

Determine the mean value and standard deviation of the maximum bending moment on the column induced by the inclined load. Assume that S and $\theta$ are statistically independent.

The maximum bending moment occurs at the fixed base of the column, which is (see Fig. E4.15)

$$M = hS\sin\theta$$

Therefore, on the basis of Eq. 4.43, the first-order approximate mean bending moment is

$$\bar{M} \simeq h\bar{S}\sin\bar{\theta} = 5(100)\sin 30^\circ = 250 \text{ N·m (newton-meter)}$$

The corresponding variance, according to Eq. 4.44a, is

$$\sigma_M^2 \simeq \sigma_S^2(h\sin\bar{\theta})^2 + \sigma_\theta^2(h\bar{S}\cos\bar{\theta})^2 = (20)^2(5\sin 30^\circ)^2 + (0.087)^2(5 \times 100\cos 30^\circ)^2 = 2500 + 1420 = 3920 \text{ (N·m)}^2$$

yielding a standard deviation

$$\sigma_M = 63 \text{ N·m}$$

The accuracy of the estimated mean bending moment may be improved by using the second-order approximation of Eq. 4.43a; thus

$$\bar{M} \simeq h\bar{S}\sin\bar{\theta} + \frac{1}{2}\left(\frac{\partial^2 M}{\partial S^2}\right)\sigma_S^2 + \frac{1}{2}\left(\frac{\partial^2 M}{\partial\theta^2}\right)\sigma_\theta^2 + \left(\frac{\partial^2 M}{\partial S\,\partial\theta}\right)\mathrm{Cov}(S, \theta)$$

$$= 250 - \frac{1}{2}(h\bar{S}\sin\bar{\theta})\,\sigma_\theta^2 = 250 - \frac{1}{2}(100 \times 5\sin 30^\circ)(0.087)^2 = 249 \text{ N·m}$$
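The calculations of Example 4.15 can be verified with the analytic partial derivatives of $M = hS\sin\theta$. This is a sketch with hypothetical variable names; because it converts 5 degrees to radians without rounding to 0.087, the variance comes out slightly higher than the rounded hand calculation.

```python
import math

h = 5.0                                   # column height, m
mean_s, sd_s = 100.0, 20.0                # load S, newtons
mean_t = math.radians(30.0)               # mean inclination, rad
sd_t = math.radians(5.0)                  # std dev of inclination, rad

# Partial derivatives of M = h * S * sin(theta), evaluated at the means.
c_s = h * math.sin(mean_t)                # dM/dS
c_t = h * mean_s * math.cos(mean_t)       # dM/dtheta

mean_m = h * mean_s * math.sin(mean_t)                # Eq. 4.43
var_m = (c_s * sd_s) ** 2 + (c_t * sd_t) ** 2         # Eq. 4.44a
sd_m = math.sqrt(var_m)

# Second-order mean (Eq. 4.43b): d2M/dS2 = 0, d2M/dtheta2 = -h S sin(theta),
# and the cross term vanishes because S and theta are independent.
mean_m2 = mean_m - 0.5 * h * mean_s * math.sin(mean_t) * sd_t ** 2

print(round(mean_m, 1), round(sd_m, 1), round(mean_m2, 2))
```

The second-order correction is less than 1 N·m here, which suggests the first-order mean is adequate for this level of angular uncertainty.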


EXAMPLE 4.16

A 2-span bridge across a 400-ft-wide river is to be built with a center pier about 200 ft from one bank of the river. To locate the center position of the pier, a base line B is established along one bank as shown in Fig. E4.16 and the pier position is determined by intersecting the lines of sight from stations a and b, with $\theta_a$ fixed at 90°.

Suppose that the pier is to be located 200 ft from the base line, which has a measured mean length $\bar{B}$ = 300 ft and a standard deviation $\sigma_B$ = 1 in.

If the angle $\theta_b$ measured from station b has $\bar{\theta}_b$ = 33°40' and $\sigma_{\theta_b}$ = 2', what are the mean and standard deviation of the measured distance D to the pier location?

$$D = B\tan\theta_b$$

Thus

$$\bar{D} \simeq 300\tan 33.667^\circ = 199.82 \text{ ft}$$

$$\sigma_D^2 \simeq (\tan\bar{\theta}_b)^2\sigma_B^2 + (\bar{B}\sec^2\bar{\theta}_b)^2\sigma_{\theta_b}^2 = 0.444\left(\tfrac{1}{12}\right)^2 + (300 \times 1.444 \times 0.000582)^2 = 0.0666 \text{ ft}^2$$

where 2' = 0.000582 rad. Or

$$\sigma_D \simeq 0.2581 \text{ ft} = 3.10 \text{ in.}$$

Figure E4.16


EXAMPLE 4.17

The capital cost (in $1000) of a combined municipal activated sludge plant may be expressed as

$$C_c = 583Q^{0.84} + (110 + 37Q)\frac{S_0}{200} + (77 + 23Q)\left(\frac{S_s}{200} - 1\right)$$

in which Q is the flow rate in million gallons per day (mgd); $S_0$ is the concentration of influent BOD (biological oxygen demand) in milligrams per liter (mg/l); and $S_s$ is the concentration of suspended solids (in mg/l).

Suppose that a waste water treatment plant is needed for the following conditions:

mean flow rate, $\bar{Q}$ = 5 mgd
mean BOD concentration, $\bar{S}_0$ = 600 mg/l
mean concentration of suspended solids, $\bar{S}_s$ = 200 mg/l

with coefficients of variation 30%, 20%, and 15%, respectively.

Determine the average capital cost of the plant, and corresponding standard deviation.

$$\bar{C}_c \simeq 583(5)^{0.84} + (110 + 37 \times 5)\left(\frac{600}{200}\right) + (77 + 23 \times 5)\left(\frac{200}{200} - 1\right) = \$3{,}138{,}219$$

$$\sigma_{C_c}^2 \simeq \left[583(0.84)\bar{Q}^{-0.16} + \frac{37\bar{S}_0}{200} + 23\left(\frac{\bar{S}_s}{200} - 1\right)\right]^2\sigma_Q^2 + \left(\frac{110 + 37\bar{Q}}{200}\right)^2\sigma_{S_0}^2 + \left(\frac{77 + 23\bar{Q}}{200}\right)^2\sigma_{S_s}^2$$

$$= \left[583(0.84)(5)^{-0.16} + \frac{37 \times 600}{200}\right]^2(1.5)^2 + \left(\frac{110 + 37 \times 5}{200}\right)^2(120)^2 + \left(\frac{77 + 23 \times 5}{200}\right)^2(30)^2$$

$$= 539{,}213 + 31{,}329 + 829 = 571{,}371$$

Therefore, $\sigma_{C_c}$ = $756,000 and the COV is $\delta_{C_c}$ = 0.24.
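First-order propagation for uncorrelated variables (Eqs. 4.43 and 4.44a) can also be carried out with numerical derivatives, which avoids differentiating the cost function by hand. The sketch below is illustrative only: it assumes the cost function of Example 4.17 reads $C_c = 583Q^{0.84} + (110 + 37Q)S_0/200 + (77 + 23Q)(S_s/200 - 1)$ in $1000, and the helper name `first_order` and its central-difference step are hypothetical choices.

```python
import math


def capital_cost(q, s0, ss):
    """Capital cost in $1000 (form assumed from Example 4.17)."""
    return (583.0 * q ** 0.84
            + (110.0 + 37.0 * q) * s0 / 200.0
            + (77.0 + 23.0 * q) * (ss / 200.0 - 1.0))


def first_order(g, means, sds, rel_step=1e-6):
    """First-order mean and standard deviation of g for uncorrelated inputs.

    The partial derivatives c_i are estimated by central differences at the
    mean values, then combined as in Eq. 4.44a.
    """
    mean_g = g(*means)
    var_g = 0.0
    for i, (m, s) in enumerate(zip(means, sds)):
        step = rel_step * max(abs(m), 1.0)
        upper = list(means)
        lower = list(means)
        upper[i] = m + step
        lower[i] = m - step
        c_i = (g(*upper) - g(*lower)) / (2.0 * step)
        var_g += (c_i * s) ** 2
    return mean_g, math.sqrt(var_g)


means = [5.0, 600.0, 200.0]            # Q (mgd), S0 (mg/l), Ss (mg/l)
covs = [0.30, 0.20, 0.15]
sds = [m * c for m, c in zip(means, covs)]

mean_c, sd_c = first_order(capital_cost, means, sds)
print(round(mean_c, 1), round(sd_c, 1), round(sd_c / mean_c, 2))
```

The results agree with the hand calculation to within rounding, and the flow rate accounts for most of the variance.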


4.4. CONCLUDING REMARKS

In this chapter, we saw that the probabilistic characteristics of a function of random variables may be derived from those of the basic constituent variables. These include, in particular, the probability distribution and the main descriptors (mean and variance) of the function. The derivation of the distribution, however, may be complicated mathematically, especially for nonlinear functions of multiple variables. Therefore, even though the required distribution may (theoretically) be derived, it is often impractical to use, except for special cases (for instance, linear functions of independent normal variates). In view of this, it is often necessary, in many applications, to describe the function in terms only of its mean and variance. Even then, only the mean and variance of linear functions are amenable to exact evaluation; for a general nonlinear function, (first-order) approximations must often be resorted to. In this chapter, we have introduced and developed the elements for such first-order analysis; these concepts will form the basis for the formal analysis of uncertainty covered in Vol. II.


PROBLEMS 


Section 4.2 


4.1 The force in the cable of the truss shown in Fig. P4.1, when subjected to a load W, is given by
VE +P 
Fa = = 
(a) If the load W is a normal variate $N(\mu_W, \sigma_W)$, derive the density function of the force $F_{ac}$.
(b) If $\mu_W$ = 20 metric tons, $\sigma_W$ = 5 metric tons, and h = M, what is the probability that the force $F_{ac}$ will exceed 30 tons? Ans. 0.0934.


4.2 A dike is proposed to be built to protect a coastal area from ocean waves (see Fig. P4.2). Assume that the wave height H is related to the wind velocity by the equation

H = 0.2V

where H is in meters and V is velocity in kilometers per hour (kph). The annual maximum wind velocity is assumed to have a log-normal distribution with a mean of 80 kph and a coefficient of variation of 15%.
(a) Determine the probability distribution of the annual maximum wave height and its parameters.
(b) If the dike is designed for a 20-year wave height, what is the design height of the dike?
(c) With this design, what is the probability that the dike will be topped by waves within the first three years?

Figure P4.1    Figure P4.2


4.3 In Example 4.14, the maximum wave pressure on structures is

$$p_{\max} = 2.7\,\frac{\rho K U^2}{D}$$

where U is the horizontal velocity of the advancing wave.
(a) If U has a log-normal distribution with parameters $\lambda_U$ and $\zeta_U$, derive the distribution of $p_{\max}$ using Eq. 4.8.
(b) Using the data given in Example 4.14, determine the probability that the maximum impact pressure will exceed 40 psi. Ans. 0.121.

4.4 The hydraulic head loss $h_L$ in a pipe due to friction may be given by the Darcy-Weisbach equation

$$h_L = f\,\frac{L}{D}\,\frac{V^2}{2g}$$

where L and D are, respectively, the length and diameter of the pipe; f is the friction factor; and V is the velocity of flow in the pipe. If V has an exponential distribution with a mean velocity $v_0$, derive the density function for the head loss $h_L$.


4.5 From the statistics collected for towns and cities in Illinois the average consumption of water, in gallons per capita per day, is found to increase with the size of population P as follows:

Suppose that the population in 1974 for a certain developing town can be described by a log-normal distribution with a mean of 10,000 and a COV of 5%. It is expected that the median of the population will gr
1974 population) per year, while the COV will
(a) Assume that the dist
time; determine the di

Figure P4.5




I ime of jate-with.a mean 
he.time-of travel -between -cities anc a:normal variate witha 
eioh coefficient.of variation of 0.10 (sec Fig- P4.6):-The time of 


4.7 A structure consisting of a cantilever beam AB and a cable BC is used to carry a load S (see Fig. P4.7). The magnitude of the load varies daily, and its monthly maximum has been observed to be Gaussian with a mean of 25,000 kg and a coefficient of variation of 30%.
(a) If the cable BC and beam AB are designed to withstand a 10-month maximum load (that is, a maximum load with a return period of 10 months) with factors of safety of 1.25 and 1.40, respectively, what are the probabilities of failure of the cable and of the beam?

Figure P4.7    Figure P4.9


(b) Assuming statistical independence between the failures of the beam and cable, what is the probability of failure of the structure (that is, that it will be unable to carry the load)?
(c) If (instead of part [a]) the strength of the cable were random N(50,000 kg; 10,000 kg), what would be its failure probability under the load S?


4.8 The occurrence of hurricanes in a Texas county is described by a Poisson process. Suppose that 32 hurricanes have occurred in the last 50 years; 28 of the 32 hurricanes occurred in the hurricane season (August 1 to November 30).
(a) For this Texas county, estimate the mean rate of occurrence of hurricanes (i) per year; (ii) per month in the hurricane season; (iii) per month in the nonhurricane season.
(b) A temporary offshore structure is to be located off the coast of this county and it is expected that the structure will operate for 19 months between April 1 and October 31 of the following year. What is the mean number of hurricanes that will occur in this period of time?
(c) What is the probability that this structure will be hit by hurricanes during its period of operation?
(d) Suppose that whenever a hurricane occurs, the owner of the structure will incur a loss of $10,000, which includes repairs for damage, loss of revenue, and so on. What is the owner's expected total loss from hurricanes? The total loss T (in dollars) is given by

T = 10,000 N

where N is the number of hurricanes during the period of operation, which is assumed to be a Poisson random variable.
(e) What is the probability that the total loss T will exceed $10,000?
4.9 To ensure proper mounting of a lens in its housing in an aerial camera, a clearance of not less than 0.10 cm and not greater than 0.35 cm is to be allowed. The clearance is the difference between the radius of the housing and the radius of the lens (see Fig. P4.9).
A lens was produced in a grinder whose past records indicate that the radii of such lenses can be regarded as a normal variate with mean of 20.00 cm and a coefficient of variation of 1%.
A housing was manufactured in a machine whose past records indicate that the radii of such housing can be regarded as a normal variate with mean of 20.20 cm and a coefficient of variation of 2%. What is the probability that the specified clearance will be met for this pair of lens and housing? Ans. 0.216.
4.10 The safety of a proposed design for the slope shown in Fig. P4.10 is to be analyzed. Suppose that the circular arc AB (with center at 0) represents the potential failure surface and that the wedge of soil contained within the arc will slide if the clockwise moment about point 0 due to the weight of the soil W exceeds the counterclockwise moment provided by the frictional forces $F_1$ and $F_2$. The following information is given:

          Mean      Standard deviation
          (kips)    (kips)
W         400       60
F1        100       30
F2        300       60

Figure P4.10

(a) Let $M_R$ = total resisting (counterclockwise) moment. Determine $E(M_R)$, $\mathrm{Var}(M_R)$.
(b) What is the probability that sliding along the arc AB will occur? Assume that W, $F_1$, $F_2$ are statistically independent normal random variables.
(c) An oil tank is proposed to be located as shown in Fig. P4.10. If the acceptable probability of sliding failure is 0.01, how heavy can the oil tank be?

4.11 The water supply to a city comes from two sources, namely, from a reservoir and from pumping underground water, as shown in Fig. P4.11. For the next 3 months, the amounts of water available from each source are independently Gaussian N(30, 3) and N(15, 4), respectively, in million gallons. Suppose that the demand in the next 3 months can be described by the probability mass function given in Fig. P4.11.
(a) Determine the probability that there will be insufficient supply of water in the next 3 months.
(b) Repeat part (a) if the demand is also Gaussian with the same mean and variance as those of Fig. P4.11.
4.12 The traffic on a bridge may be described by a Poisson process with mean 


Figure P4.11 (sources: reservoir N(30, 3) and pumping N(15, 4); PMF of the demand in the next 3 months, in m.g.)

Figure P4.34a


is composed of two components: the settlements of the sand and clay strata. The flexibilities (that is, inches settlement per foot of strata per ton of applied load) of the two strata, denoted $F_S$ and $F_C$, are independent normal variates N(0.001, 0.0002) and N(0.008, 0.002), respectively. The total column load is W, which may be assumed to be statistically independent of $F_S$ and $F_C$.
(a) If W = 20 tons, what is the probability that the total settlement will exceed 3 in.? Ans. 0.007.
(b) Suppose the load W is also a random variable with the PMF given in Fig. P4.34b.

Figure P4.34b

With this PMF of W, determine the mean and variance of the total settlement by first-order approximation. In this case, what would be the probability that the settlement will exceed 3 in.? Ans. 2.2; 0.51; 0.058.


5. Estimating Parameters From 


Observational Data 


5.1. THE ROLE OF STATISTICAL INFERENCE 
IN ENGINEERING 


We have seen in the previous chapters that once we know (or assume) the distribution function of a random variable and the values of its parameters, the probabilities associated with events defined by values of the random variable can be computed. The calculated probability is clearly a function of the values of the parameters, as well as of the assumed form of distribution. Naturally, questions pertaining to the determination of the parameters, such as the mean value $\mu$ and variance $\sigma^2$, and the choice of specific distributions are of interest.

Answers to these questions often require observational data. For example, in determining the maximum wind speed for the design of a tall building, past records of measured wind velocities at or near the building site are pertinent and important; similarly, in designing a left-turn lane at an existing highway crossing, a traffic count of left turns at the intersection may be required. Based on these observations, information about the probability distribution may be inferred, and its parameters estimated statistically.

In many geographic regions, data on natural processes, such as rainfall intensities, flood levels, wind velocity, earthquake frequencies and magnitudes, traffic volumes, pollutant concentrations, ocean wave heights and forces, have been and continue to be collected and reported in published records. Field and laboratory data on the variabilities of concrete strength, yield strength of steel, fatigue lives of materials, shear strength of soils, efficiency of construction crews and equipment, measurement errors in surveying, and many others, continue to be collected. These statistical data provide the information from which the probability model and the corresponding parameters required in engineering design may be developed or evaluated.


The techniques of deriving probabilistic information and of estimating parameter values from observational data are embodied in the methods of statistical inference, in which information obtained from sampled data is used to make generalizations about the populations from which the samples were obtained. Inferential methods of statistics, therefore, provide a link between the real world and the (idealized) probability models assumed or prescribed in a probabilistic analysis. The role of statistical inference in the decision-making process is schematically shown in Fig. 5.1.

Figure 5.1 Role of statistical inference in decision-making process (flow: Real World; Data Collection; Statistical Inference: Estimation of Parameters, Choice of Distribution; Calculation of Probabilities, using the prescribed distributions and estimated parameters; Information for Decision-Making and Design)

This chapter is devoted to the estimation of statistical parameters; the subject of determining probability distribution is covered in Chapter 6. Although there are other inferential methods of statistics, only those that are most basic and of wide applications in engineering are discussed here. These include principally the methods of estimation (point and interval estimations) in this chapter, determination of probability distributions in Chapter 6, and regression and correlation analyses in Chapter 7. Chapter 8 presents the Bayesian approach to the estimation problem. The more esoteric topics of statistics such as design of experiments and analysis of variance are not covered. Moreover, we shall not dwell on such theoretical questions as the unbiasedness, efficiency, consistency, and sufficiency of an estimator; only the concepts underlying the above methods are developed, and their significance to engineering problems is emphasized and illustrated.


5.1.1. Inherent variability and estimation error 


It may be emphasized that even when the distribution function and the parameters of a random variable are known, we still cannot predict with certainty the occurrence (or nonoccurrence) of specific events. At best, we can say that an event will occur with an associated probability. The underlying uncertainty, in this case, is due to the inherent randomness of the natural phenomenon. However, uncertainty arises also from the inaccuracies in the estimation of the parameters and in the choice of the distribution. For example, when available data are limited, the estimated mean and variance may not be accurate and the distribution function determined on the basis of available data may not be the most appropriate. Such errors, therefore, would contribute additional uncertainty.

Uncertainties associated with errors of parameter estimation can be reduced by increasing the amount of data, whereas the uncertainty associated with the inherent variability may remain unchanged or may even increase with additional data.

More generally, errors would also include inaccuracies of modeling and prediction. For example, when an idealized mathematical equation is used to evaluate an engineering system, or its response to specified input, the imperfection of the mathematical model gives rise also to further uncertainty. Such imperfections may be due to factors whose effects were not explicitly reflected in the model, or to gross idealizations necessary for mathematical tractability and urgency of engineering solutions.

In general, therefore, we shall consider uncertainty to be the result of (inherent or natural) variability as well as of (prediction) error. A general model for the systematic assessment and analysis of uncertainty is developed in Vol. II.


5.2. CLASSICAL APPROACH TO ESTIMATION
OF PARAMETERS


Classical estimation of parameters is divided into point and interval estimation. Point estimation is concerned with the calculation of a single number, from a set of observational data, to represent the parameter of the underlying population; interval estimation goes further to establish a statement of confidence in the estimated quantity, resulting in the determination of an interval indicating the range wherein the population parameter may be located (with the associated confidence).


Figure 5.2 Role of sampling in statistical inference (the real-world "population," whose true characteristics are unknown, is modeled by a random variable X on the real line with distribution $f_X(x)$; sampling (experimental observations) yields a sample $\{x_1, x_2, \ldots, x_n\}$; statistical estimation gives the sample mean $\bar{x}$ and sample variance $s^2$, from which the mean $\mu$ and variance $\sigma^2$ of the theoretical model are inferred)


5.2.1. Random sampling and point estimation 


As alluded to earlier, the parameters of a probability model may be eval- 
-uated-or estimated. only on the basis of a set of observational data obtained 


from the “population”; such a data set represents a sample of the popu- ` 


lation, and thus the value of à parameter. calculated on-the basis of the 
sample values is necessarily an estimater of the parameter. In other words, 
the exact values of the population parameters are generally unknown; 
the best that cán be done is to make estimates of these values by sampling 
the population. As indicated in Fig. 5.2, the real-world population may be 
modeled by a random variable X, with probability. distribution fx (zx) and 
associated. parameters, for example u and ø in the case of the normal dis- 
tribution. The form of fx(z) may be derived on the basis of physical con- 


siderations, or determined empirically, as described in Chapter 6. Invari- | 


ably, however, the parameters such as p and e must be estimated . from 

: sampled observational data. ` nC 
Given. a set of sample values, there are different methods for estimating 
“the parameters; among these are the method of moments and method of 
maximum likelihood. Regardless of the method, estimation of parameters 


5.2. CLASSICAL APPROACH TO ESTIMATION OF PARAMETERS 223. 


is necessarily based on a set of;sample values, say zi ---, tn, called a 


sample of size n from the population X; Usually such a sample is assumed 
to constitute à random. sample; this means that the successive sample 
valués are independent and the underlying population“(or the distribution 
of X) remains the same from one sample value to another. 

We should point out that there are certain properties that are desirable
of a point estimator: unbiasedness, consistency, efficiency, and
sufficiency. If the expected value of an estimator is equal to the parameter,
the estimator is said to be unbiased. Unbiasedness, therefore, implies
that on the average the value of the estimator will be equal to the parameter;
however, nothing is said about whether the individual values of the
estimator are close to the parameter itself. On the other hand, the property
of consistency implies that as $n \to \infty$, the estimator approaches the value
of the parameter. Consistency, therefore, is an asymptotic property;
practically, it means that the error in the estimator decreases as the sample
size increases. Efficiency refers to the variance of the estimator; if everything
else is equal, an estimator $\hat{\theta}_1$ is said to be more efficient than another
$\hat{\theta}_2$ if $\hat{\theta}_1$ has a smaller variance than that of $\hat{\theta}_2$. Finally, an estimator is said
to be sufficient if it utilizes all the information in a sample that is pertinent
to the estimation of the parameter.

In practice, however, it is impractical to require all or many of these
properties; seldom, in fact, is there an estimator that possesses all the above
properties. In the sequel, we may refer to one or more of these
properties in connection with specific estimators.


The method of moments. In Chapter 3, we saw that the mean and
variance are the main descriptors of a random variable. These are related
to the parameters of the distribution, as shown in Table 5.1 for a number
of common distributions. For example, in the case of a normal random
variable, the parameters $\mu$ and $\sigma^2$ of the distribution are also the mean and
variance of the variate, whereas, in the case of the gamma distribution,
the parameters $\nu$ and $k$ are related to the mean and variance as follows:

$$E(X) = k/\nu$$

$$\mathrm{Var}(X) = k/\nu^2$$

On the basis of the relationships between the moments of a random variable
and the parameters of the corresponding distribution, such as those shown
in Table 5.1, it follows that the parameters of a distribution may be
determined by first estimating the mean and variance (and higher moments,
if necessary) of the random variable. This, in essence, is the basis of the
method of moments.
Intuitively, it seems that the sample moments may be used as estimates


Table 5.1. Common Distributions and Their Parameters
[Columns: Distribution; Probability density function (PDF) or mass function (PMF); Parameters; Relation to mean and variance]


226 ESTIMATING PARAMETERS FROM OBSERVATIONAL DATA 


of the corresponding moments of the random variable. In this regard, just
as the mean and variance are the (weighted) averages of $X$ and $(X - \mu)^2$,
the sample mean and sample variance can be defined as the respective
averages of a sample of size $n$, namely $x_1, \ldots, x_n$, as follows.

$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i \qquad (5.1)$$

and

$$s^2 = \frac{1}{n}\sum_{i=1}^{n} (x_i - \bar{x})^2 \qquad (5.2)$$

Accordingly, $\bar{x}$ and $s^2$ are the point estimates of the population mean $\mu$
and population variance $\sigma^2$, respectively.


After the mean and variance of the random variable (or higher moments,
if required) have been estimated, the parameters of its probability distribution
can then be determined; for example, through the relationships given
in Table 5.1 (see Example 5.2).
It should be pointed out that Eq. 5.2 is a "biased" estimate (Freund,
1962) for the variance. This bias can be removed by dividing the sum of
squares by $(n - 1)$ instead of $n$ (see Eq. 5.32); thus the unbiased sample
variance,

$$s^2 = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2 \qquad (5.3)$$

is preferred over Eq. 5.2. Of course, for large $n$ there is little difference
between the two estimates of Eqs. 5.2 and 5.3.

By expanding the squared terms, it can be shown that Eq. 5.3 may also
be expressed as

$$s^2 = \frac{1}{n-1}\left[\sum_{i=1}^{n} x_i^2 - n\bar{x}^2\right] \qquad (5.3a)$$


EXAMPLE 5.1 


Consider the data on the crushing strength of concrete for 25 specimens listed in
Table E5.1. To determine the values of its mean and variance, $\mu$ and $\sigma^2$, we apply
Eqs. 5.1 and 5.3a, obtaining

$$\bar{x} = \frac{1}{25}\sum_{i=1}^{25} x_i = 5.6 \text{ ksi}$$

and

$$s^2 = \frac{1}{24}\left[\sum_{i=1}^{25} x_i^2 - 25(5.6)^2\right] = 0.44 \text{ (ksi)}^2$$

On the basis of the data, the mean and variance of the crushing strength of the
concrete are 5.6 ksi and 0.44 (ksi)$^2$, respectively.




Table E5.1. Computation of Mean and
Variance for Concrete Crushing Strength
of Example 5.1.

Specimen
number      x_i       x_i^2

 1          5.6       31.36
 2          5.3       28.09
 3          4.0       16.00
 4          4.4       19.36
 5          5.5       30.25
 6          5.7       32.49
 7          6.0       36.00
 8          5.6       31.36
 9          7.1       50.41
10          4.7       22.09
11          5.5       30.25
12          5.9       34.81
13          6.4       40.96
14          5.8       33.64
15          6.7       44.89
16          5.4       29.16
17          5.0       25.00
18          5.8       33.64
19          6.2       38.44
20          5.6       31.36
21          5.7       32.49
22          5.9       34.81
23          5.4       29.16
24          5.1       26.01
25          5.7       32.49
Sum       140.00     794.52

$$\bar{x} = \frac{140.00}{25} = 5.60$$

$$s^2 = \frac{1}{24}\left[794.52 - 25(5.60)^2\right] = 0.44$$


The calculations for $\bar{x}$ and $s^2$ can be performed conveniently in tabular form as
illustrated in Table E5.1.
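The tabular calculation of Example 5.1 can also be sketched in a few lines of Python. The function names below are illustrative assumptions, not notation from the text; the data are the 25 strengths of Table E5.1, with specimen 9 taken as 7.1 ksi, the value consistent with its tabulated square of 50.41.

```python
# Crushing strengths (ksi) of the 25 specimens in Table E5.1.
strengths = [5.6, 5.3, 4.0, 4.4, 5.5, 5.7, 6.0, 5.6, 7.1, 4.7, 5.5, 5.9, 6.4,
             5.8, 6.7, 5.4, 5.0, 5.8, 6.2, 5.6, 5.7, 5.9, 5.4, 5.1, 5.7]


def sample_mean(xs):
    """Eq. 5.1: arithmetic average of the sample values."""
    return sum(xs) / len(xs)


def biased_variance(xs):
    """Eq. 5.2: sum of squared deviations divided by n."""
    m = sample_mean(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)


def unbiased_variance(xs):
    """Eq. 5.3a: (sum of squares - n * mean^2) divided by (n - 1)."""
    n = len(xs)
    m = sample_mean(xs)
    return (sum(x * x for x in xs) - n * m * m) / (n - 1)


x_bar = sample_mean(strengths)
s2 = unbiased_variance(strengths)
s2_biased = biased_variance(strengths)
print(f"mean = {x_bar:.2f} ksi, s^2 = {s2:.2f}, biased s^2 = {s2_biased:.2f}")
```

For n = 25 the two variance estimates differ only by the factor 24/25, which illustrates the remark that Eqs. 5.2 and 5.3 nearly coincide for large n.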


EXAMPLE 5.2 


Data for fatigue life of 75 S-T aluminum yield the histogram of Fig. 1.5. It is
suggested that a log-normal distribution will fit the shape of the histogram well.
Estimate the parameters $\lambda$ and $\zeta$ of the log-normal distribution.

The sample mean and sample variance are computed to be

$$\bar{x} = 26.75 \text{ million cycles}$$

$$s^2 = 360.0 \text{ (million cycles)}^2$$

According to the relationships in Table 5.1, the mean and variance of a log-normal
distribution are given by

$$E(X) = \exp\left(\lambda + \tfrac{1}{2}\zeta^2\right)$$

$$\mathrm{Var}(X) = E^2(X)\left(e^{\zeta^2} - 1\right)$$

Hence, the estimates of the parameters $\lambda$ and $\zeta$, denoted as $\hat{\lambda}$ and $\hat{\zeta}$, are obtained as
the solutions to the following equations.

$$\exp\left(\hat{\lambda} + \tfrac{1}{2}\hat{\zeta}^2\right) = \bar{x} = 26.75$$

$$(26.75)^2\left(e^{\hat{\zeta}^2} - 1\right) = s^2 = 360.0$$

Thus, $\hat{\lambda} = 3.08$ and $\hat{\zeta} = 0.64$.
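The two moment equations of Example 5.2 can be solved in closed form: the second gives $\zeta^2 = \ln(1 + s^2/\bar{x}^2)$, and the first then gives $\lambda = \ln \bar{x} - \zeta^2/2$. A minimal Python sketch follows; the function name is an assumption introduced for illustration.

```python
import math


def lognormal_moments_fit(mean, variance):
    """Method-of-moments estimates (lambda, zeta) for a log-normal variate.

    Solves E(X) = exp(lam + zeta^2 / 2) and
    Var(X) = E(X)^2 * (exp(zeta^2) - 1) for lam and zeta.
    """
    zeta_sq = math.log(1.0 + variance / mean ** 2)
    lam = math.log(mean) - 0.5 * zeta_sq
    return lam, math.sqrt(zeta_sq)


# Sample moments reported in Example 5.2 (million cycles).
lam_hat, zeta_hat = lognormal_moments_fit(26.75, 360.0)
print(f"lambda = {lam_hat:.2f}, zeta = {zeta_hat:.2f}")
```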


The method of maximum likelihood. A method of point estimation
that is popular among statisticians is the maximum likelihood method. In
contrast to the method of moments, the maximum likelihood method provides
a procedure for deriving the point estimator of the parameter directly.
Consider a random variable $X$ with density function $f(x; \theta)$, in which $\theta$ is
the parameter, such as the mean $\lambda$ in the exponential distribution. On the
basis of the sample values $x_1, \ldots, x_n$, one may inquire: "what is the most
likely value of $\theta$ that produces the set of observations $x_1, \ldots, x_n$?" In
other words, among the possible values of $\theta$, what is the value that will
maximize the likelihood of obtaining the set of observations? Such is the
rationale underlying the maximum likelihood method of point estimation.
The likelihood of obtaining a particular sample value $x_i$ can be assumed
to be proportional to the value of the probability density function evaluated
at $x_i$. Then, assuming random sampling, the likelihood of obtaining $n$
independent observations $x_1, \ldots, x_n$ is

$$L(x_1, \ldots, x_n; \theta) = f(x_1; \theta)\, f(x_2; \theta) \cdots f(x_n; \theta) \qquad (5.4)$$

which is the likelihood function of observing the set $x_1, \ldots, x_n$. The maximum
likelihood estimator $\hat{\theta}$ is then the value of $\theta$ that maximizes the likelihood
function $L(x_1, \ldots, x_n; \theta)$. This estimator may be obtained by differentiating
$L(x_1, \ldots, x_n; \theta)$ with respect to $\theta$ and setting the derivative
equal to zero, giving usually an absolute maximum (Hoel, 1962); that is,
$\hat{\theta}$ is obtained as the solution to the following equation.

$$\frac{\partial L(x_1, \ldots, x_n; \theta)}{\partial \theta} = 0 \qquad (5.5)$$

Because of the multiplicative nature of the likelihood function, it is frequently
more convenient to maximize the logarithm of the likelihood
function instead; that is,

$$\frac{\partial \log L(x_1, \ldots, x_n; \theta)}{\partial \theta} = 0 \qquad (5.6)$$

The solution for $\hat{\theta}$ from Eq. 5.6 should be the same as that obtained with
Eq. 5.5.
For density functions with two or more parameters, the likelihood
function becomes

$$L(x_1, \ldots, x_n; \theta_1, \ldots, \theta_m) = \prod_{i=1}^{n} f(x_i; \theta_1, \ldots, \theta_m) \qquad (5.7)$$

where $\theta_1, \ldots, \theta_m$ are the $m$ parameters to be estimated. In this case, the
maximum likelihood estimators would be obtained from the solution to the
following set of simultaneous equations.

$$\frac{\partial L(x_1, \ldots, x_n; \theta_1, \ldots, \theta_m)}{\partial \theta_j} = 0; \qquad j = 1, \ldots, m \qquad (5.8)$$

The maximum likelihood estimate (MLE) of a parameter possesses
many of the desirable properties of an estimator mentioned earlier. In particular,
for large sample size $n$, the maximum likelihood estimator is often
considered the "best" estimate, in that it has the minimum variance
(asymptotically) (Hoel, 1962).
EXAMPLE 5.3 


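The times between successive arrivals of vehicles in a traffic flow were observed as
follows.

1.2, 3.0, 6.3, 10.1, 5.2, 2.4, 7.1 sec

Suppose the interarrival time of vehicles follows an exponential distribution; that is,

$$f_T(t) = \frac{1}{\lambda}\, e^{-t/\lambda}$$

Determine the maximum likelihood estimate (MLE) for the mean interarrival
time $\lambda$.
From Eq. 5.4, the likelihood function of the seven observed values is

$$L(t_1, \ldots, t_7; \lambda) = \prod_{i=1}^{7} \frac{1}{\lambda} \exp\left(-\frac{t_i}{\lambda}\right) = \lambda^{-7} \exp\left(-\frac{1}{\lambda}\sum_{i=1}^{7} t_i\right)$$

where $t_i$ is the $i$th observed interarrival time. Then, according to Eq. 5.5,

$$\frac{\partial L}{\partial \lambda} = -7\lambda^{-8} \exp\left(-\frac{1}{\lambda}\sum_{i=1}^{7} t_i\right) + \lambda^{-7}\left(\frac{1}{\lambda^2}\sum_{i=1}^{7} t_i\right) \exp\left(-\frac{1}{\lambda}\sum_{i=1}^{7} t_i\right) = 0$$

or

$$\left[-7 + \frac{1}{\lambda}\sum_{i=1}^{7} t_i\right] \exp\left(-\frac{1}{\lambda}\sum_{i=1}^{7} t_i\right) = 0$$

from which we obtain

$$\hat{\lambda} = \frac{1}{7}\sum_{i=1}^{7} t_i = 5.04 \text{ sec}$$

In general, therefore, the MLE for $\lambda$ from a sample of size $n$ is

$$\hat{\lambda} = \frac{1}{n}\sum_{i=1}^{n} t_i$$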
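As a numerical check on Example 5.3, the sketch below evaluates the logarithm of the likelihood (Eq. 5.4) on a grid of candidate values and compares the maximizer with the closed-form result, the sample mean. The grid range and spacing and the function name are arbitrary illustrative choices.

```python
import math

# Observed interarrival times (sec) from Example 5.3.
times = [1.2, 3.0, 6.3, 10.1, 5.2, 2.4, 7.1]


def log_likelihood(lam, ts):
    """Logarithm of Eq. 5.4 for the density f(t) = (1/lam) * exp(-t/lam)."""
    return -len(ts) * math.log(lam) - sum(ts) / lam


# Closed-form MLE derived in the example: the sample mean.
lam_closed = sum(times) / len(times)

# Brute-force search over candidate values 0.01, 0.02, ..., 20.00 sec.
grid = [k / 100 for k in range(1, 2001)]
lam_grid = max(grid, key=lambda lam: log_likelihood(lam, times))

print(f"closed form: {lam_closed:.2f} sec, grid search: {lam_grid:.2f} sec")
```

Maximizing the logarithm rather than the likelihood itself follows Eq. 5.6 and avoids multiplying many small density values together.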


EXAMPLE 5.4 


A triaxial specimen of saturated sand is subjected to cyclic vertical loads with a
stress amplitude of ±200 psf in a laboratory test. The number of load cycles applied
until the sand specimen fails has been recorded for five independent specimens as
follows.

25, 20, 28, 33, 26 cycles

Suppose the number of load cycles to failure for the sand is assumed to follow a
log-normal distribution; estimate the parameters $\lambda$ and $\zeta$ by the maximum likelihood
method.

Let us first derive the general expressions of the maximum likelihood estimators $\hat{\lambda}$
and $\hat{\zeta}$ for a log-normal distribution. From Eq. 5.7, the likelihood function is given by

$$L(x_1, \ldots, x_n; \lambda, \zeta) = \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi}\,\zeta x_i} \exp\left[-\frac{1}{2}\left(\frac{\ln x_i - \lambda}{\zeta}\right)^2\right]$$

$$= \left(\frac{1}{\sqrt{2\pi}\,\zeta}\right)^n \left(\prod_{i=1}^{n} \frac{1}{x_i}\right) \exp\left[-\frac{1}{2\zeta^2}\sum_{i=1}^{n} (\ln x_i - \lambda)^2\right]$$

The presence of exponentials suggests that it is more convenient to work with the
logarithmic form; thus

$$\ln L(x_1, \ldots, x_n; \lambda, \zeta) = -n \ln\sqrt{2\pi} - n \ln\zeta - \sum_{i=1}^{n} \ln x_i - \frac{1}{2\zeta^2}\sum_{i=1}^{n} (\ln x_i - \lambda)^2$$

To maximize the likelihood function, we have

$$\frac{\partial \ln L}{\partial \lambda} = \frac{1}{\zeta^2}\sum_{i=1}^{n} (\ln x_i - \lambda) = 0$$

$$\frac{\partial \ln L}{\partial \zeta} = -\frac{n}{\zeta} + \frac{1}{\zeta^3}\sum_{i=1}^{n} (\ln x_i - \lambda)^2 = 0$$

Since $\zeta \neq 0$, the solution to these two simultaneous equations yields

$$\hat{\lambda} = \frac{1}{n}\sum_{i=1}^{n} \ln x_i$$

and

$$\hat{\zeta}^2 = \frac{1}{n}\sum_{i=1}^{n} (\ln x_i - \hat{\lambda})^2$$

Substituting the values of the observed data, we obtain the MLE of the parameters as

$$\hat{\lambda} = \tfrac{1}{5}(\ln 25 + \ln 20 + \ln 28 + \ln 33 + \ln 26) = 3.26$$

$$\hat{\zeta}^2 = \tfrac{1}{5}\left[(\ln 25 - 3.26)^2 + (\ln 20 - 3.26)^2 + (\ln 28 - 3.26)^2 + (\ln 33 - 3.26)^2 + (\ln 26 - 3.26)^2\right] = 0.027$$
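The closed-form estimators of Example 5.4 translate directly into code. The following Python sketch (the function name is an assumption) applies them to the five observed cycle counts; note that the estimator of $\zeta^2$ divides by $n$, not $n - 1$, as the maximum likelihood derivation requires.

```python
import math

# Load cycles to failure for the five sand specimens of Example 5.4.
cycles = [25, 20, 28, 33, 26]


def lognormal_mle(xs):
    """Maximum likelihood estimates (lambda, zeta^2) for a log-normal sample."""
    logs = [math.log(x) for x in xs]
    n = len(logs)
    lam = sum(logs) / n
    # The MLE of zeta^2 divides the sum of squared deviations by n.
    zeta_sq = sum((v - lam) ** 2 for v in logs) / n
    return lam, zeta_sq


lam_hat, zeta_sq_hat = lognormal_mle(cycles)
print(f"lambda = {lam_hat:.2f}, zeta^2 = {zeta_sq_hat:.3f}")
```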

5.2.2. Interval estimation of the mean
How good is the estimator $\bar{X}$? So far, we have discussed the point
estimates of the mean and variance; such estimates, however, do not convey
information on the degree of accuracy of these estimates. For this reason,
the interval over which a parameter may lie often is used to supplement
the point estimate (a single number) of the same parameter. Such intervals
are called the confidence intervals, and the method of estimation is known
as interval estimation.

Since we are using the sample mean to estimate the population mean
$\mu$, the accuracy of this estimate is naturally of concern. We examine this as
follows.
First of all, for a random sample of size $n$, the values $x_1, x_2, \ldots, x_n$
can be conceived to be the respective sample values of a set of independent
random variables $X_1, X_2, \ldots, X_n$. Moreover, in random sampling, the
density functions of $X_1, \ldots, X_n$ are individually the same as that of the
population $X$; that is,

$$f_{X_1}(x) = f_{X_2}(x) = \cdots = f_{X_n}(x) = f_X(x)$$

Then the sample mean is also a random variable

$$\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i \qquad (5.9)$$

Its expected value is

$$E(\bar{X}) = E\left(\frac{1}{n}\sum_{i=1}^{n} X_i\right) = \frac{1}{n}\sum_{i=1}^{n} E(X_i) = \frac{1}{n}(n\mu)$$

Hence

$$E(\bar{X}) = \mu \qquad (5.10)$$

The expected value of the sample mean $\bar{X}$, therefore, is equal to the
population mean; in view of this, $\bar{X}$ is said to be an "unbiased" estimator of
the population mean $\mu$.

Since $\bar{X}$ is a random variable, it also has a variance:

$$\mathrm{Var}(\bar{X}) = \mathrm{Var}\left(\frac{1}{n}\sum_{i=1}^{n} X_i\right) = \frac{1}{n^2}\,\mathrm{Var}\left(\sum_{i=1}^{n} X_i\right)$$

But $X_1, X_2, \ldots, X_n$ are statistically independent (in random sampling);
hence

$$\mathrm{Var}(\bar{X}) = \frac{1}{n^2}\sum_{i=1}^{n} \mathrm{Var}(X_i) = \frac{1}{n^2}(n\sigma^2) = \frac{\sigma^2}{n} \qquad (5.11)$$

Figure 5.3 Distribution of sample mean

Therefore, according to Eqs. 5.10 and 5.11, the sample mean $\bar{X}$ has a
mean value $\mu$ and standard deviation $\sigma/\sqrt{n}$. These results apply so long
as the $X_i$'s are statistically independent and identically distributed as
$X$; that is, if random sampling is assumed. In practice, of course, it is
difficult to verify whether the assumptions of random sampling are satisfied;
it is important, however, that every effort be made to ensure that samples
taken from a population are sufficiently random to permit the use of the
results derived above.


In Chapter 4 we saw that the sum of $n$ independent normal variates is
also a normal variate. Hence, if the underlying population $X$ is Gaussian,
the estimator $\bar{X}$ is also Gaussian. Moreover, if the sample size $n$ is
sufficiently large, the sample mean $\bar{X}$ will be approximately Gaussian (by
virtue of the central limit theorem), even if the underlying population is not
Gaussian.

Therefore, for large sample size $n$, we can generally assume that $\bar{X}$ has a
normal distribution $N(\mu, \sigma/\sqrt{n})$. As the sample size $n$ increases, the
distribution of $\bar{X}$ becomes narrower as illustrated in Fig. 5.3, indicating that
the quality of the estimate $\bar{x}$ improves with the sample size $n$. In other
words, as $n$ increases the sample mean $\bar{x}$ is more likely to be closer to the
population mean $\mu$. In the extreme case, as $n \to \infty$, $\bar{x} \to \mu$.
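The results of Eqs. 5.10 and 5.11 can be illustrated by simulation. The sketch below draws many random samples of size n from a normal population and compares the spread of the resulting sample means with $\sigma/\sqrt{n}$. The population values, the number of repetitions, and the seed are assumptions chosen only for illustration.

```python
import random
import statistics

rng = random.Random(1)            # fixed seed so that the run is repeatable
mu, sigma, n = 5.6, 0.65, 25      # illustrative population and sample size

# Each entry is the mean of one random sample of size n.
means = [statistics.fmean(rng.gauss(mu, sigma) for _ in range(n))
         for _ in range(5000)]

observed_sd = statistics.stdev(means)   # spread of the simulated sample means
predicted_sd = sigma / n ** 0.5         # Eq. 5.11: sigma / sqrt(n)
print(f"observed {observed_sd:.3f}, predicted {predicted_sd:.3f}")
```

Increasing n in this sketch narrows the spread of the simulated means, which is the behavior Fig. 5.3 describes.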


Confidence interval with known variance. Consider first the case
in which there is prior knowledge of the variance or standard deviation of
the population, and only the mean value is to be estimated. This condition
is sometimes encountered; for example, in electronic distance measurement,
the standard error is fairly constant for a given type of equipment; in such
cases, therefore, $\sigma$ may be assumed to be known from previous experience.
We have just seen that for large sample size, the sample mean $\bar{X}$ can be
described with the normal distribution $N(\mu, \sigma/\sqrt{n})$. By a simple
transformation (see Example 4.1), it can be shown that $(\bar{X} - \mu)/(\sigma/\sqrt{n})$ is a
standard normal variate. Hence the probability that $(\bar{X} - \mu)/(\sigma/\sqrt{n})$
will be in a given interval, for example, between $\pm 1.96$, is given by

$$P\left(-1.96 < \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \le 1.96\right) = \Phi(1.96) - \Phi(-1.96) = 0.95$$

For the concrete data in Example 5.1, if the standard deviation $\sigma$ is
known (for example, through years of experience and testing) to be equal
to 0.65 ksi, the preceding statement becomes

$$P\left[-1.96 < \frac{\bar{X} - \mu}{0.65/\sqrt{25}} \le 1.96\right] = 0.95$$

or

$$P[-0.255 < \bar{X} - \mu \le 0.255] = 0.95$$

Physically, this statement implies that before obtaining the test results,
it is expected that the sample mean $\bar{X}$ will lie within 0.255 ksi of the
actual mean $\mu$ with 95% probability. After the test results are obtained,
which give $\bar{x} = 5.6$ (from Example 5.1), the equation above yields

$$P[-0.255 < 5.6 - \mu \le 0.255] = 0.95$$

or

$$P[5.6 - 0.255 \le \mu < 5.6 + 0.255] = 0.95$$

Thus

$$P[5.345 \le \mu < 5.855] = 0.95$$

This appears to imply that "the mean value $\mu$ of the crushing strength of
the concrete lies between 5.345 ksi and 5.855 ksi with probability 0.95."

Strictly speaking, however, this implication is not correct. In the first
place, in the classical approach the population mean $\mu$ is a constant, not a
random variable. Moreover, if another 25 concrete specimens were tested,
the probability that $\mu$ will lie in the same interval may well be different.

Alternatively, if different sets of observed data were used to construct
similar 95% probability intervals, we may say that on the average 95%
of these intervals will contain the population mean $\mu$. Hence, in a general
case, if we denote $(1 - \alpha)$ as the specified confidence level, and $\pm k_{\alpha/2}$ as
the values of the standard normal variate with cumulative probability
levels $\alpha/2$ and $(1 - \alpha/2)$, respectively, as shown in Fig. 5.4, we may write

$$P\left(-k_{\alpha/2} < \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \le k_{\alpha/2}\right) = 1 - \alpha \qquad (5.12)$$

Figure 5.4 Density function of $(\bar{X} - \mu)/(\sigma/\sqrt{n})$ (area in each tail = $\alpha/2$)

Upon rearrangement and substitution of the observed sample mean $\bar{x}$,
Eq. 5.12 becomes

$$P\left(\bar{x} - k_{\alpha/2}\frac{\sigma}{\sqrt{n}} < \mu \le \bar{x} + k_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right) = 1 - \alpha$$

More properly, then, the interval estimated on the basis of a single sample
of size $n$ should be interpreted as follows: "There is a confidence of $(1 - \alpha)$
that the estimated interval contains the unknown $\mu$." Thus such an interval
is called the $(1 - \alpha)$ confidence interval for the mean $\mu$, and is given
by

$$\langle\mu\rangle_{1-\alpha} = \left(\bar{x} - k_{\alpha/2}\frac{\sigma}{\sqrt{n}};\ \bar{x} + k_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right) \qquad (5.13)$$


It should be emphasized that the confidence intervals so obtained would
be exact for normal populations with known standard deviations. However,
for nonnormal populations, confidence intervals of Eq. 5.13 are only
approximate; the accuracy of the approximation, however, will increase
with the sample size $n$.
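The frequency interpretation of a confidence interval, namely that on the average 95% of intervals built from repeated samples contain $\mu$, can be checked with a small simulation. The sketch below is illustrative only: the population parameters reuse the concrete values quoted above, while the seed and number of trials are arbitrary assumptions.

```python
import random
from statistics import NormalDist, fmean

rng = random.Random(7)                  # fixed seed so that the run is repeatable
mu, sigma, n = 5.6, 0.65, 25            # assumed "true" population and sample size

k = NormalDist().inv_cdf(0.975)         # k_{alpha/2} for a 95% level, about 1.96
half_width = k * sigma / n ** 0.5       # about 0.255 ksi

trials = 4000
hits = 0
for _ in range(trials):
    x_bar = fmean(rng.gauss(mu, sigma) for _ in range(n))
    # Count the intervals (Eq. 5.13) that contain the true mean.
    if x_bar - half_width <= mu <= x_bar + half_width:
        hits += 1

coverage = hits / trials
print(f"fraction of intervals containing mu: {coverage:.3f}")
```

Each simulated interval either contains $\mu$ or it does not; the 95% figure describes the long-run fraction of intervals that do, not a probability attached to any single interval.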
The following steps summarize the general procedure for establishing the
confidence interval of the mean $\mu$ when there is prior knowledge of the
standard deviation $\sigma$.

1. Choose the confidence level $(1 - \alpha)$.

2. Determine the value $k_{\alpha/2}$ from a table of normal probability (for
example, Table A.1 in Appendix A); specifically,

$$k_{\alpha/2} = \Phi^{-1}\left(1 - \frac{\alpha}{2}\right)$$

3. Apply Eq. 5.13 using the sample mean $\bar{x}$ estimated from the observed
sample of size $n$, obtaining the $(1 - \alpha)$ confidence interval for the
mean $\mu$:

$$\langle\mu\rangle_{1-\alpha} = \left(\bar{x} - k_{\alpha/2}\frac{\sigma}{\sqrt{n}};\ \bar{x} + k_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right)$$


EXAMPLE 5.5


The daily dissolved oxygen (DO) concentration for a stream at a station has been
recorded for 30 days. The daily level of DO concentration is known to vary with a
standard deviation of $\sigma = \sqrt{4.2} \approx 2.05$ mg/l. From the sample of 30 observations, the sample
mean is calculated to be $\bar{x} = 2.52$ mg/l. Determine the 99% confidence interval for
the mean daily DO concentration.

Following the steps outlined above, we obtain

(a) $1 - \alpha = 0.99$, or $\alpha = 1 - 0.99 = 0.01$
(b) from Table A.1, $k_{\alpha/2} = \Phi^{-1}(0.995) = 2.58$
(c) $\dfrac{\sigma}{\sqrt{n}}\, k_{\alpha/2} = \dfrac{\sqrt{4.2}}{\sqrt{30}}\,(2.58) = 0.965$

The 99% confidence interval for $\mu$, therefore, is (2.52 - 0.965; 2.52 + 0.965) or
(1.56; 3.49) mg/l.

Similarly, we obtain the 95% confidence interval as follows:

$$k_{0.025} = \Phi^{-1}(0.975) = 1.96$$

and

$$\frac{\sigma}{\sqrt{n}}\, k_{0.025} = 0.733$$

Hence, the 95% confidence interval is (1.79; 3.25) mg/l.

Therefore, if the distribution of DO concentration is Gaussian, these results
would be the exact 99% and 95% confidence intervals. If the underlying distribution
is not Gaussian (or is unknown), the results obtained above would be approximate
confidence intervals.
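The three-step procedure can be sketched in Python using the standard library's `statistics.NormalDist` in place of Table A.1. The function name is an assumption for illustration. The value $\sigma = \sqrt{4.2}$ mg/l is used because it reproduces the half-widths 0.965 and 0.733 reported in Example 5.5; small differences in the last digit arise because the table value 2.58 is a rounding of 2.576.

```python
from statistics import NormalDist


def mean_confidence_interval(x_bar, sigma, n, level):
    """Two-sided confidence interval of Eq. 5.13 for a known sigma."""
    alpha = 1.0 - level                            # step 1: confidence level
    k = NormalDist().inv_cdf(1.0 - alpha / 2.0)    # step 2: k_{alpha/2}
    half = k * sigma / n ** 0.5                    # step 3: apply Eq. 5.13
    return x_bar - half, x_bar + half


sigma = 4.2 ** 0.5                                 # mg/l, as inferred from the example
ci99 = mean_confidence_interval(2.52, sigma, 30, 0.99)
ci95 = mean_confidence_interval(2.52, sigma, 30, 0.95)
print(f"99%: ({ci99[0]:.2f}; {ci99[1]:.2f}) mg/l")
print(f"95%: ({ci95[0]:.2f}; {ci95[1]:.2f}) mg/l")
```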


Comparing the 95% and 99% confidence intervals computed in Example
5.5, we observe that the 99% confidence interval is larger than that at the
95% confidence level. This is reasonable because a larger interval is more
likely to contain $\mu$ than a smaller one.

Furthermore, we observe that the confidence interval for the mean
depends on the standard deviation $\sigma$ and on the sample size $n$. From Eq.
5.13, it is clear that as $\sigma$ decreases or as $n$ increases, the confidence interval
becomes narrower for the same confidence level $(1 - \alpha)$. This means that
smaller population variance or larger sample size would increase the
accuracy of the sample mean as the estimator of the population mean.

Figure 5.5 Student t-distribution


Confidence interval with unknown variance. In general, the value
of $\sigma$ is not known and must be estimated using Eq. 5.3. The foregoing
procedure for determining the confidence interval for $\mu$ may still be used
when the sample size $n$ is large. That is, when $n$ is large (for instance,
$n > 20$), the sample variance $s^2$ is a good estimator of the population variance
$\sigma^2$ (see Section 5.2.4). Consequently, in such cases, using $s$ for $\sigma$ in Eq.
5.13, we obtain the confidence interval as if $\sigma$ is known. It should be
emphasized, however, that the confidence intervals so obtained will be very
approximate if $n$ is small (for example, $n < 10$).

When there is no prior knowledge of the population variance, an exact
confidence interval for $\mu$ can be determined if the underlying population
is Gaussian. In this case, the probability distribution of $(\bar{X} - \mu)/(S/\sqrt{n})$
is required. This can be shown (for example, Freund, 1962) to have the
t-distribution (or the Student t-distribution) with $(n - 1)$ degrees of
freedom, whose density function is

$$f_T(t) = \frac{\Gamma[(f + 1)/2]}{\sqrt{\pi f}\;\Gamma(f/2)}\left(1 + \frac{t^2}{f}\right)^{-(1/2)(f + 1)}, \qquad -\infty < t < \infty \qquad (5.14)$$

where $f$ is the number of degrees of freedom. A family of t-distributions with various
values of $f$ is shown in Fig. 5.5. It may be observed that the t-distribution
has a bell-shaped density function similar to the normal curve and is
symmetrical about the origin. For small values of $f$ the density function of the
t-distribution is flatter than the standard normal distribution; however, as
$f$ increases, it tends toward the standard normal distribution, as illustrated
in Fig. 5.5.

On this basis, therefore, we can form the following probability statement
for the random variable $(\bar{X} - \mu)/(S/\sqrt{n})$:

$$P\left(-t_{\alpha/2,n-1} < \frac{\bar{X} - \mu}{S/\sqrt{n}} \le t_{\alpha/2,n-1}\right) = 1 - \alpha \qquad (5.15)$$

where $t_{\alpha/2,n-1}$ has a similar interpretation as $k_{\alpha/2}$ of Eq. 5.12. In the present
case, of course, $t_{\alpha/2,n-1}$ denotes the percentile value of the t-distribution
with $(n - 1)$ degrees of freedom. In general, $t_{\alpha/2,f}$ is the value of the
variate $T$ at the cumulative probability $(1 - \alpha/2)$, as shown in Fig. 5.6.
Values of $t_{\alpha/2,f}$ are tabulated for various probability levels $p = (1 - \alpha/2)$, with
different degrees of freedom, in Table A.2. Rearranging the terms in Eq.
5.15, the exact confidence interval for the mean (of Gaussian population),
therefore, is

$$\langle\mu\rangle_{1-\alpha} = \left(\bar{x} - t_{\alpha/2,n-1}\frac{s}{\sqrt{n}};\ \bar{x} + t_{\alpha/2,n-1}\frac{s}{\sqrt{n}}\right) \qquad (5.16)$$

where $\bar{x}$ and $s$ are the sample mean and sample standard deviation, $n$ is the
sample size, and $(1 - \alpha)$ is the specified confidence level.

Figure 5.6 $(1 - \alpha)$ confidence interval in t-distribution (unshaded area = $1 - \alpha$)


EXAMPLE 5.6 


Suppose that the 30 observations of daily DO concentration presented in Example
5.5 give a sample mean $\bar{x} = 2.52$ mg/l and sample standard deviation $s = \sqrt{4.2} \approx 2.05$ mg/l.
Determine the confidence interval for the population mean $\mu$.

The number of data points is 30; hence, $(\bar{X} - \mu)/(S/\sqrt{n})$ has a t-distribution with
$f = n - 1 = 29$ degrees of freedom. If a 99% confidence level is desired, $\alpha =
1 - 0.99 = 0.01$, and $(\alpha/2) = 0.005$. From Table A.2, with $f = 29$, we obtain under
the column for $(1 - \alpha/2) = 0.995$,

$$t_{\alpha/2,29} = t_{0.005,29} = 2.756$$

Hence the 99% confidence interval for the mean daily DO concentration is

$$\langle\mu\rangle_{0.99} = \left(2.52 - 2.756\,\frac{\sqrt{4.2}}{\sqrt{30}};\ 2.52 + 2.756\,\frac{\sqrt{4.2}}{\sqrt{30}}\right) = (1.49;\ 3.55) \text{ mg/l}$$

This is a larger interval than that obtained in Example 5.5 of (1.56; 3.49) mg/l, which
was obtained assuming that the standard deviation is known. This is to be expected,
since not knowing the value of $\sigma$ introduces additional uncertainty; hence, to
maintain the same confidence level, a wider interval is required than if $\sigma$ is known.
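Equation 5.16 has the same form as Eq. 5.13 with $t_{\alpha/2,n-1}$ in place of $k_{\alpha/2}$ and $s$ in place of $\sigma$. Because the Python standard library has no t-distribution quantile function, the sketch below takes the tabulated value 2.756 (from Table A.2, as quoted in Example 5.6) as an input. The function name and the use of $s = \sqrt{4.2}$ mg/l, the value that reproduces the reported interval, are assumptions for illustration.

```python
def t_confidence_interval(x_bar, s, n, t_crit):
    """Two-sided confidence interval of Eq. 5.16.

    t_crit is the tabulated percentile t_{alpha/2, n-1}, supplied by the caller.
    """
    half = t_crit * s / n ** 0.5
    return x_bar - half, x_bar + half


# Example 5.6: n = 30, so f = 29 and t_{0.005,29} = 2.756.
lower, upper = t_confidence_interval(2.52, 4.2 ** 0.5, 30, 2.756)
print(f"99% interval: ({lower:.2f}; {upper:.2f}) mg/l")
```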


One-sided confidence limit for the mean. The confidence interval
established above is called a two-sided confidence interval, because it
includes the upper and lower limits that bound the value of the population
mean $\mu$. There are instances, in practice, in which only the lower limit
or the upper limit is pertinent. In such cases, we would be interested in
the one-sided confidence limit for the mean $\mu$. For example, in the case of
material strength, or capacity of a highway or of a flood channel, the lower
limit of the mean $\mu$ will be of engineering interest.

For such purposes, the $(1 - \alpha)$ lower confidence limit, denoted $\langle\mu)_{1-\alpha}$,
means that the population mean $\mu$ will be larger than this limit with a
confidence level of $(1 - \alpha)$. Assuming prior knowledge of $\sigma$, such a limit
is obtained by forming the following probability statement for the standard
normal variate $(\bar{X} - \mu)/(\sigma/\sqrt{n})$:

$$P\left(\frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \le k_\alpha\right) = 1 - \alpha \qquad (5.17)$$

where $1 - \alpha$ is the specified confidence level, and $k_\alpha = \Phi^{-1}(1 - \alpha)$.
Rearranging the terms in Eq. 5.17, we have

$$P\left(\mu \ge \bar{X} - k_\alpha\frac{\sigma}{\sqrt{n}}\right) = 1 - \alpha \qquad (5.17a)$$

Hence the $(1 - \alpha)$ lower confidence limit for the mean $\mu$ is

$$\langle\mu)_{1-\alpha} = \left(\bar{x} - k_\alpha\frac{\sigma}{\sqrt{n}}\right) \qquad (5.18)$$


EXAMPLE 5.7

Test results for 100 randomly selected specimens of 1-cm diameter A36 steel show
that the sample mean and sample standard deviation of the yield strength are,
respectively, $\bar{x} = 2200$ kgf (kilogram force) and $s = 220$ kgf. For specification
purposes, the manufacturer is required to specify the 95% lower confidence limit
of the mean yield strength. Because of the large sample size ($n = 100$), assume that
$\sigma$ is satisfactorily given by $s$, which is 220 kgf.

With

$$1 - \alpha = 0.95, \qquad \alpha = 0.05$$

and

$$k_{0.05} = \Phi^{-1}(0.95) = 1.65$$

The lower confidence limit, therefore, is

$$\langle\mu)_{0.95} = \bar{x} - k_\alpha\frac{\sigma}{\sqrt{n}} = 2200 - 1.65\,\frac{220}{\sqrt{100}} = 2164 \text{ kgf}$$

In other words, we are 95% confident that the mean yield strength will be at least
2164 kgf.
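A one-sided limit differs from the two-sided interval only in using $k_\alpha$ rather than $k_{\alpha/2}$. The following Python sketch of Eq. 5.18 (the function name is an assumption) reproduces Example 5.7, with the standard normal quantile taken from `statistics.NormalDist` instead of a table.

```python
from statistics import NormalDist


def lower_confidence_limit(x_bar, sigma, n, level):
    """(1 - alpha) lower confidence limit of Eq. 5.18, with sigma known."""
    k = NormalDist().inv_cdf(level)      # k_alpha = inverse normal CDF at 1 - alpha
    return x_bar - k * sigma / n ** 0.5


# Example 5.7: x_bar = 2200 kgf, sigma taken as s = 220 kgf, n = 100.
limit = lower_confidence_limit(2200.0, 220.0, 100, 0.95)
print(f"95% lower confidence limit: {limit:.0f} kgf")
```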


Conversely, there are situations in which the upper confidence limits are
required. For example, in determining the wind load on a structure, we
would like to state with a high degree of confidence that the mean wind
load will not exceed a certain limit. In such a case, the upper confidence
limit on $\mu$ is desired. Following the same procedure as that of Eq. 5.17, it
can be shown that the $(1 - \alpha)$ upper confidence limit is

$$(\mu\rangle_{1-\alpha} = \left(\bar{x} + k_\alpha\frac{\sigma}{\sqrt{n}}\right) \qquad (5.19)$$

If the sample size $n$ is small, and the population standard deviation is
not known, the t-distribution should be used to determine the corresponding
upper and lower confidence limits. On this basis, the appropriate
confidence limits are as follows:

$(1 - \alpha)$ lower confidence limit

$$\langle\mu)_{1-\alpha} = \bar{x} - t_{\alpha,n-1}\frac{s}{\sqrt{n}} \qquad (5.20)$$

$(1 - \alpha)$ upper confidence limit

$$(\mu\rangle_{1-\alpha} = \bar{x} + t_{\alpha,n-1}\frac{s}{\sqrt{n}} \qquad (5.21)$$

It should be emphasized that the confidence intervals given in Eqs.
5.13 and 5.16, and the one-sided confidence limits of Eqs. 5.18 through
5.21, for the population mean $\mu$ are exact if the underlying population is
Gaussian. However, for practical purposes, these results are applicable
to non-Gaussian populations if the sample size is at least moderately large
(for example, $n > 10$); for this reason, the preceding equations may be
used to determine the (approximate) confidence intervals and limits of $\mu$
irrespective of the distribution of the underlying population.


EXAMPLE 5.8 


Table E5.8 shows data for storms and associated runoffs on the Monocacy River
at Jug Bridge, Maryland (data from Linsley and Franzini, 1964).

(a) Compute the sample mean and sample standard deviation for the precipitation
and runoff, based on the data given in Table E5.8.
(b) Using the sample variance in place of the corresponding population variance,
determine the 99.9% confidence interval for the mean precipitation. Also determine
the corresponding 99.9% upper confidence limit.


Table E5.8. Precipitation and Runoff Data


Storm no.    Precipitation (in.)    Runoff (in.)
 1           1.11                   0.52
 2           1.17                   0.40
 3           1.79                   0.97
 4           5.62                   2.92
 5           1.13                   0.17
 6           1.54                   0.19
 7           3.19                   0.76
 8           1.73                   0.66
 9           2.09                   0.78
10           2.75                   1.24
11           1.20                   0.39
12           1.01                   0.30
13           1.64                   0.70
14           1.57                   0.77
15           1.54                   0.59
16           2.09                   0.95
17           3.54                   1.02
18           1.17                   0.39
19           1.15                   0.23
20           2.57                   0.45
21           3.57                   1.59
22           5.11                   1.74
23           1.52                   0.56
24           2.93                   1.12
25           1.16                   0.64




Solution


(a) Let $x$ = precipitation (in inches), and $y$ = runoff (in inches).


x_i       x_i^2      y_i       y_i^2
1.11      1.23       0.52      0.27
1.17      1.37       0.40      0.16
1.79      3.20       0.97      0.94
5.62      31.58      2.92      8.53
1.13      1.28       0.17      0.03
1.54      2.37       0.19      0.04
3.19      10.18      0.76      0.58
1.73      2.99       0.66      0.44
2.09      4.37       0.78      0.61
2.75      7.56       1.24      1.54
1.20      1.44       0.39      0.15
1.01      1.02       0.30      0.09
1.64      2.69       0.70      0.49
1.57      2.46       0.77      0.59
1.54      2.37       0.59      0.35
2.09      4.37       0.95      0.90
3.54      12.53      1.02      1.04
1.17      1.37       0.39      0.15
1.15      1.32       0.23      0.05
2.57      6.60       0.45      0.20
3.57      12.74      1.59      2.53
5.11      26.11      1.74      3.03
1.52      2.31       0.56      0.31
2.93      8.58       1.12      1.25
1.16      1.35       0.64      0.41
Sum:
53.89     153.39     20.05     24.68

Thus

$$\bar{x} = \frac{53.89}{25} = 2.16 \text{ in.}$$

$$s_x^2 = \frac{1}{24}\left[153.39 - 25(2.16)^2\right] = 1.53$$

$$s_x = 1.24 \text{ in.}$$

$$\bar{y} = \frac{20.05}{25} = 0.80 \text{ in.}$$

$$s_y^2 = \frac{1}{24}\left[24.68 - 25(0.80)^2\right] = 0.36$$

$$s_y = 0.60 \text{ in.}$$

(b) With $\bar{x} = 2.16$, and assuming $\sigma_x \approx s_x = 1.24$, the standard deviation of $\bar{X}$ is

$$\sigma_{\bar{X}} = \frac{\sigma_x}{\sqrt{n}} = \frac{1.24}{\sqrt{25}} = 0.25 \text{ in.}$$

Although the precipitation is not Gaussian (see Example 6.3), $\bar{X}$ may be assumed to
be approximately Gaussian since the sample size is relatively large ($n = 25$). Hence,
according to Eq. 5.13, the 99.9% confidence interval for the mean precipitation is

$$\langle\mu_X\rangle_{0.999} = \left(\bar{x} - k_{0.0005}\frac{\sigma_x}{\sqrt{n}};\ \bar{x} + k_{0.0005}\frac{\sigma_x}{\sqrt{n}}\right)$$

$$= [2.16 - 3.29(0.25);\ 2.16 + 3.29(0.25)]$$

$$= (1.34 \text{ in.},\ 2.98 \text{ in.})$$

whereas the one-sided upper 99.9% confidence limit on the mean precipitation is

$$(\mu_X\rangle_{0.999} = \bar{x} + k_{0.001}\frac{\sigma_x}{\sqrt{n}} = 2.16 + 3.09 \times 0.25 = 2.93 \text{ in.}$$

EXAMPLE 5.9 


(a) In a traffic survey where speeds of vehicles are measured, it is desired to
determine the mean vehicle speed to within ±1 kph (kilometer per hour) with 99%
confidence. From a preliminary study, the standard deviation of the vehicle speed is
found to be 3.58 kph. Assume that all observations are independent; determine the
number of observations required.

(b) If 150 observations were taken, what would be the confidence level associated
with the interval of ±1 kph of the mean speed? Assume that the standard deviation
of vehicle speed is still 3.58 kph.


Solution


(a) Let $n$ be the number of observations required. The confidence interval is
given by $\bar{x} \pm k_{\alpha/2}(\sigma/\sqrt{n})$, where $\sigma$ is the known standard deviation. For a 99%
confidence level, $\alpha = 0.01$ and $k_{\alpha/2} = k_{0.005} = 2.58$. Therefore, setting

$$2.58 \times \frac{3.58}{\sqrt{n}} = 1$$

the number of observations required is

$$n = (2.58 \times 3.58)^2 = 85.3$$

that is, 86 observations.

(b) If 150 observations had been taken and the same confidence interval were
desired, we would expect the confidence level to increase. In other words, the value
of $\alpha$ would decrease. Setting

$$k_{\alpha/2}\,\frac{3.58}{\sqrt{150}} = 1$$

we obtain

$$k_{\alpha/2} = \frac{\sqrt{150}}{3.58} = 3.42$$

and

$$\frac{\alpha}{2} = 1 - \Phi(3.42) = 1 - 0.99969 = 0.00031$$

The confidence level, therefore, is

$$1 - \alpha = 1 - 0.00062 = 0.99938 \text{ or } 99.938\%$$
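Both parts of Example 5.9 invert Eq. 5.13: part (a) solves for $n$ given the half-width, and part (b) solves for the confidence level given $n$. A Python sketch follows; the function names are assumptions, and the exact normal quantile (about 2.576) is used rather than the table value 2.58, which does not change the rounded-up sample size.

```python
import math
from statistics import NormalDist


def required_sample_size(sigma, half_width, level):
    """Smallest n for which k_{alpha/2} * sigma / sqrt(n) <= half_width."""
    k = NormalDist().inv_cdf(1.0 - (1.0 - level) / 2.0)
    return math.ceil((k * sigma / half_width) ** 2)


def confidence_level(sigma, half_width, n):
    """Confidence level 1 - alpha implied by a given half-width and n."""
    k = half_width * math.sqrt(n) / sigma      # k_{alpha/2}
    return 2.0 * NormalDist().cdf(k) - 1.0     # 1 - alpha = 2 * Phi(k) - 1


n_required = required_sample_size(3.58, 1.0, 0.99)    # part (a)
level_150 = confidence_level(3.58, 1.0, 150)          # part (b)
print(f"n = {n_required}, confidence level with 150 observations = {level_150:.5f}")
```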


5.2.3. Problems of measurement theory


One of the major applications of point and interval estimation is in the
theory of measurements (Parratt, 1961; Barry, 1964). Problems involving
measurements require estimation of a fixed (but unknown) quantity, which
is therefore analogous to the estimation of the unknown population mean
$\mu$.

In measuring, for instance, a distance à, several (for example n) measure- 
ments may be taken constituting a sample of size n. The object then is to 
estimate the actual distance à from the sample measurements* dı, ds, - - *, da. 
Point and interval estimations then may be used to estimate this true 
distance (not its mean value). In this regard, à is analogous to p; hence 




Figure 5.7 Histogram of measured distance (after Bachmann, 1973). 

* Observed measurements will, in general, contain two types of measurement errors, 
namely, random errors and systematic errors (Parratt, 1961; Barry, 1964). It is assumed 
here that the sample measurements have been adjusted for systematic errors. 


the methods developed for estimating the mean $\mu$ (which is a constant)
can be used to estimate the distance $\delta$ (also a constant). In particular,
the point estimate of $\delta$ is

$$\bar{d} = \frac{1}{n}\sum_{i=1}^{n} d_i \tag{5.22}$$


In other words, a series of measurements $d_1, d_2, \ldots, d_n$ are presumed to
be the sample values of the independent random variables $D_1, D_2, \ldots, D_n$
representing the populations of possible measurements, so that the point
estimator of $\delta$ is

$$\bar{D} = \frac{1}{n}\sum_{i=1}^{n} D_i \tag{5.23}$$

with expected value

$$E(\bar{D}) = \delta \tag{5.24}$$

and

$$\mathrm{Var}(\bar{D}) \simeq \frac{s^2}{n} \tag{5.25}$$

where

$$s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(d_i - \bar{d})^2$$

In measurement theory, the standard deviation of $\bar{D}$, that is,
$\sqrt{\mathrm{Var}(\bar{D})} \simeq s/\sqrt{n}$, is known as the standard error.

Implicit in Eqs. 5.22 through 5.25 are also the assumptions of random
sampling; namely, in this case, that $D_1, D_2, \ldots, D_n$ are statistically
independent and are identically distributed, or
$f_{D_1} = f_{D_2} = \cdots = f_{D_n}$. Moreover, these distributions are
invariably assumed to be normal, as supported by observations (see, for example,
Fig. 5.7).

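It follows then that the variate $(\bar{D} - \delta)/(S/\sqrt{n})$ has a
t-distribution with $(n - 1)$ degrees of freedom; hence, the basis for the
confidence interval for $\delta$ is

$$P\left(-t_{\alpha/2,\,n-1} < \frac{\bar{D} - \delta}{S/\sqrt{n}} \le t_{\alpha/2,\,n-1}\right) = 1 - \alpha$$

and thus the $(1 - \alpha)$ confidence interval for $\delta$ is

$$\langle\delta\rangle_{1-\alpha} = \left(\bar{d} - t_{\alpha/2,\,n-1}\frac{s}{\sqrt{n}}\,;\; \bar{d} + t_{\alpha/2,\,n-1}\frac{s}{\sqrt{n}}\right) \tag{5.26}$$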
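As a sketch of how Eqs. 5.22 through 5.26 fit together, the following Python fragment computes the point estimate, the standard error, and the confidence interval for a set of repeated distance measurements. The function name `t_confidence_interval` is hypothetical. The readings are those of Example 5.10 later in this section, with the third reading taken as 45479.3 m (an assumption, chosen because it reproduces the stated mean of 45479.43 m), and the t-value 1.833 is the tabulated $t_{0.05,9}$ quoted in that example rather than something computed here.

```python
from math import sqrt
from statistics import mean, stdev


def t_confidence_interval(measurements, t_value):
    """Two-sided interval of Eq. 5.26; t_value is t_{alpha/2, n-1} from a table."""
    n = len(measurements)
    d_bar = mean(measurements)        # point estimate of the distance, Eq. 5.22
    s = stdev(measurements)           # sample standard deviation (divisor n - 1)
    std_error = s / sqrt(n)           # standard error of the mean, Eq. 5.25
    half_width = t_value * std_error
    return d_bar, std_error, (d_bar - half_width, d_bar + half_width)


# Ten independent readings in metres (third value assumed to be 45479.3).
readings = [45479.4, 45479.6, 45479.3, 45479.5, 45479.8,
            45479.2, 45479.6, 45479.5, 45479.3, 45479.1]

d_bar, std_error, (lower, upper) = t_confidence_interval(readings, t_value=1.833)
print(f"{d_bar:.2f} m, standard error {std_error:.3f} m, ({lower:.2f}; {upper:.2f}) m")
```

The t-value is passed in as an argument because the standard library has no t-distribution quantile function; a statistics package could supply it instead.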


When a function of one or more distances (or geometric dimensions)
is involved, the value of the function is usually estimated on the basis of
the mean measured distances. That is, if a function of several distances
$l_1, l_2, \ldots, l_k$ is

$$\zeta = Z(l_1, l_2, \ldots, l_k)$$

in which $l_1, l_2, \ldots, l_k$ are estimated, respectively, by the mean
measurements $\bar{l}_1, \bar{l}_2, \ldots, \bar{l}_k$, then the point estimator of
$\zeta$ (using the approximation of Eq. 4.48) is

$$\bar{Z} \simeq Z(\bar{L}_1, \bar{L}_2, \ldots, \bar{L}_k) \tag{5.27}$$

where $\bar{L}_i$ is the estimator of $l_i$ in accordance with Eq. 5.23. The
estimator $\bar{Z}$, therefore, is also a random variable with

$$E(\bar{Z}) \simeq Z[E(\bar{L}_1), E(\bar{L}_2), \ldots, E(\bar{L}_k)] = \zeta \tag{5.28}$$

and, in view of the errors in $\bar{L}_i$, the standard error in $\bar{Z}$, assuming
independent $\bar{L}_1, \bar{L}_2, \ldots, \bar{L}_k$, therefore, is obtained
(applying Eq. 4.44) from

$$\mathrm{Var}(\bar{Z}) \simeq \sum_{i=1}^{k}\left(\frac{\partial Z}{\partial \bar{L}_i}\right)^2 \sigma_{\bar{L}_i}^2 \tag{5.29}$$

which is known as the "propagation of errors" in measurement theory.

Thence, assuming $\bar{Z}$ to be Gaussian* with mean $\zeta$ and standard deviation
$\sigma_{\bar{Z}} = \sqrt{\mathrm{Var}(\bar{Z})}$, we obtain the confidence interval
for $\zeta$ as follows:

$$P\left(-k_{\alpha/2} < \frac{\bar{Z} - \zeta}{\sigma_{\bar{Z}}} \le k_{\alpha/2}\right) = 1 - \alpha$$

Thus the $(1 - \alpha)$ confidence interval is

$$\langle\zeta\rangle_{1-\alpha} = \left(\bar{z} - k_{\alpha/2}\,\sigma_{\bar{Z}}\,;\; \bar{z} + k_{\alpha/2}\,\sigma_{\bar{Z}}\right) \tag{5.30}$$

where $\bar{z} = Z(\bar{l}_1, \bar{l}_2, \ldots, \bar{l}_k)$.


To clarify these, consider the following examples from surveying. 


EXAMPLE 5.10 


The straight-line distance between two geodetic stations A and B is measured with
an electronic ranging instrument called a tellurometer. The following are ten


* This assumption would be consistent with the first-order approximation if Eq. 5.27 is
linear; however, for nonlinear functions, this assumption would not be valid. The
estimators $\bar{L}_1, \bar{L}_2, \ldots, \bar{L}_k$ are approximately Gaussian by
virtue of the central limit theorem; hence $\bar{Z}$ will not be Gaussian unless
Eq. 5.27 is linear.




independent measurements of the distance: 


    (1) 45479.4 m        (6) 45479.2 m
    (2) 45479.6 m        (7) 45479.6 m
    (3) 45479.3 m        (8) 45479.5 m
    (4) 45479.5 m        (9) 45479.3 m
    (5) 45479.8 m       (10) 45479.1 m


(a) Estimate the true distance $\delta$.

(b) Compute the standard deviation of the measured distances.

(c) Determine the standard error of the estimated distance.

(d) Determine the 2-sided 90% confidence interval of the actual distance $\delta$.


Solution 
(a) Estimated distance,

$$\bar{d} = \tfrac{1}{10}(45479.4 + 45479.6 + \cdots + 45479.1) = 45479.43 \text{ m}$$

(b) Variance of the measured distances,

$$s^2 = \tfrac{1}{9}\left[(45479.4 - 45479.43)^2 + (45479.6 - 45479.43)^2 + \cdots + (45479.1 - 45479.43)^2\right] = \tfrac{1}{9}(0.401) = 0.0445 \text{ m}^2$$

Hence the standard deviation is $s = 0.21$ m.

(c) According to Eq. 5.25, the standard error of the estimated distance is

$$\frac{s}{\sqrt{n}} = \frac{0.21}{\sqrt{10}} = 0.066 \text{ m}$$

(d) With $f = n - 1 = 9$ and $\alpha = 0.10$, $t_{0.05,9} = 1.833$ from Table A.2. Then
using $\bar{d} = 45479.43$ m and $s = 0.21$ m, we obtain the 90% confidence interval
for $\delta$,

$$\langle\delta\rangle_{0.90} = \left[45479.43 - 1.833\left(\frac{0.21}{\sqrt{10}}\right);\; 45479.43 + 1.833\left(\frac{0.21}{\sqrt{10}}\right)\right] = (45479.31;\; 45479.55) \text{ m}$$

EXAMPLE 5.11

The area of a rectangular tract of land is being considered. The sides of the
rectangle are measured several times, with associated statistics summarized as
follows (see also Fig. E5.11).


              No. of independent    Mean
    Length    measurements          measurement    Sample variance
    D         9                     60 m           0.81 m²
    B         4                     70 m           0.64 m²
    C         4                     30 m           0.32 m²


Determine the 95% confidence interval of the actual area of the tract.


Figure E5.11


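Solution
In this case, the area is

$$A = (B + C)D$$

According to Eq. 5.27, the area is estimated by

$$\bar{A} = (\bar{B} + \bar{C})\bar{D}$$

Substituting the mean measured distances, the estimated area therefore is

$$\bar{a} = (70 + 30)(60) = 6000 \text{ m}^2$$

The standard error of the estimator $\bar{A}$, according to Eq. 5.29, is obtained from

$$\sigma_{\bar{A}}^2 = \bar{D}^2\sigma_{\bar{B}}^2 + \bar{D}^2\sigma_{\bar{C}}^2 + (\bar{B} + \bar{C})^2\sigma_{\bar{D}}^2 = 60^2\left(\frac{s_B^2}{4}\right) + 60^2\left(\frac{s_C^2}{4}\right) + 100^2\left(\frac{s_D^2}{9}\right)$$

$$= 3600(0.16) + 3600(0.08) + 10{,}000(0.09) = 1764 \text{ m}^4$$

Thus

$$\sigma_{\bar{A}} = 42 \text{ m}^2$$

Finally, using Eq. 5.30, we obtain the 95% confidence interval for the area $A$ as
follows:

$$k_{0.025} = 1.96$$

$$P\left(-1.96 < \frac{\bar{A} - A}{42} \le 1.96\right) = 0.95$$

Thus the required confidence interval is

$$\langle A\rangle_{0.95} = [6000 - 1.96(42);\; 6000 + 1.96(42)] = (5917.7;\; 6082.3) \text{ m}^2$$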
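The propagation-of-errors calculation of Example 5.11 can be sketched in a few lines of Python. The function name `area_interval` and its argument names are illustrative only; the partial derivatives are written out by hand for the specific function $A = (B + C)D$, so this is not a general-purpose implementation of Eq. 5.29.

```python
from math import sqrt


def area_interval(b, c, d, var_b, var_c, var_d, n_b, n_c, n_d, k=1.96):
    """First-order (propagation of errors) interval for the area A = (B + C) * D."""
    area = (b + c) * d
    # Variances of the mean measurements: sample variance divided by sample size.
    vb, vc, vd = var_b / n_b, var_c / n_c, var_d / n_d
    # Partial derivatives at the means: dA/dB = dA/dC = D and dA/dD = B + C.
    var_area = d ** 2 * vb + d ** 2 * vc + (b + c) ** 2 * vd
    sigma_area = sqrt(var_area)
    return area, sigma_area, (area - k * sigma_area, area + k * sigma_area)


area, sigma_area, (lower, upper) = area_interval(
    b=70.0, c=30.0, d=60.0,                 # mean measurements, m
    var_b=0.64, var_c=0.32, var_d=0.81,     # sample variances, m^2
    n_b=4, n_c=4, n_d=9,                    # numbers of measurements
)
print(area, sigma_area, lower, upper)
```

With these inputs the half-width of the interval is $1.96 \times 42 = 82.32$ m². Keeping the three variance contributions separate also shows which side dominates the uncertainty; here it is $D$, through the term $(\bar{B} + \bar{C})^2\sigma_{\bar{D}}^2 = 900$ m⁴.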


The problems illustrated in Examples 5.10 and 5.11 are quite common
in surveying, photogrammetry, and geodetic engineering.


5.2.4. Interval estimation of the variance 


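How good is the estimator $S^2$? Using an approach similar to the
establishment of the confidence interval for the population mean, the
confidence interval for the population variance $\sigma^2$ may also be developed.
For this purpose, we first observe that the sample variance (see Eq. 5.3)

$$S^2 = \frac{1}{n-1}\sum_{i=1}^{n}(X_i - \bar{X})^2 \tag{5.31}$$

is a random variable, with expected value

$$E(S^2) = \frac{1}{n-1}E\left[\sum_{i=1}^{n}(X_i - \bar{X})^2\right] = \frac{1}{n-1}E\left[\sum_{i=1}^{n}\{(X_i - \mu) - (\bar{X} - \mu)\}^2\right]$$

$$= \frac{1}{n-1}\left[\sum_{i=1}^{n}E(X_i - \mu)^2 - nE(\bar{X} - \mu)^2\right]$$

but

$$E(X_i - \mu)^2 = \sigma^2$$

and it can be shown that

$$E(\bar{X} - \mu)^2 = \frac{\sigma^2}{n}$$

Hence

$$E(S^2) = \frac{1}{n-1}\left[n\sigma^2 - n\,\frac{\sigma^2}{n}\right] = \sigma^2 \tag{5.32}$$

For this reason, Eq. 5.31 is an unbiased estimator of $\sigma^2$, as we asserted
earlier in Eq. 5.3.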
The variance of $S^2$ is given (Hald, 1952) by

$$\mathrm{Var}(S^2) = \frac{\sigma^4}{n}\left(\frac{\mu_4}{\sigma^4} - \frac{n-3}{n-1}\right) \tag{5.33}$$

where $\mu_4 = E(X - \mu)^4$ is the fourth central moment of the population
random variable $X$. It may be observed that as $n$ increases, the variance
of $S^2$ decreases.
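As a quick check of the expression $\mathrm{Var}(S^2) = (\sigma^4/n)\,[\mu_4/\sigma^4 - (n-3)/(n-1)]$, consider the special case of a Gaussian population, for which the fourth central moment is $\mu_4 = 3\sigma^4$ (a standard property of the normal distribution, not derived in this section). Substituting gives

```latex
\mathrm{Var}(S^2)
  = \frac{\sigma^4}{n}\left(3 - \frac{n-3}{n-1}\right)
  = \frac{\sigma^4}{n}\cdot\frac{3(n-1) - (n-3)}{n-1}
  = \frac{\sigma^4}{n}\cdot\frac{2n}{n-1}
  = \frac{2\sigma^4}{n-1}
```

which decreases as $n$ grows, in line with the remark above.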


Confidence interval for $\sigma^2$. For large $n$, the sample variance of Eq.
5.31 may be assumed, on the basis of the central limit theorem, to have a
normal distribution with the mean and variance of Eqs. 5.32 and 5.33.
Then, at a confidence level of $(1 - \alpha)$,

$$P\left(-k_{\alpha/2} < \frac{S^2 - \sigma^2}{\sqrt{\mathrm{Var}(S^2)}} \le k_{\alpha/2}\right) = 1 - \alpha \tag{5.34}$$

where $\mu_4$ in Eq. 5.33 can be evaluated using the fourth sample moment, or

$$\mu_4 \simeq \frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^4 \tag{5.35}$$

Hence the $(1 - \alpha)$ confidence interval for the population variance (when
$n$ is large) may be obtained as

$$\langle\sigma^2\rangle_{1-\alpha} = \left[s^2 - k_{\alpha/2}\sqrt{\mathrm{Var}(S^2)}\,;\; s^2 + k_{\alpha/2}\sqrt{\mathrm{Var}(S^2)}\right] \tag{5.36}$$

If the population is Gaussian, the variance of $S^2$ is (Freund, 1962)

$$\mathrm{Var}(S^2) = \frac{2\sigma^4}{n-1} \tag{5.37}$$

Hence, the corresponding confidence interval becomes

$$\langle\sigma^2\rangle_{1-\alpha} = \left[\frac{s^2}{1 + k_{\alpha/2}\sqrt{2/(n-1)}}\,;\; \frac{s^2}{1 - k_{\alpha/2}\sqrt{2/(n-1)}}\right] \tag{5.38}$$


EXAMPLE 5.12 


For the concrete strength data of Example 5.1, we had $s^2 = 0.44$ and $n = 25$.
Assuming the strength of the concrete to be a normal variate, we obtain the 95%
confidence interval for its variance $\sigma^2$, approximately on the basis of
Eq. 5.38, as follows:

$$\langle\sigma^2\rangle_{0.95} = \left(\frac{0.44}{1 + 1.96\sqrt{2/24}}\,;\; \frac{0.44}{1 - 1.96\sqrt{2/24}}\right) = (0.28;\; 1.01)$$
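The large-sample interval for the variance of a Gaussian population, $s^2/[1 \pm k_{\alpha/2}\sqrt{2/(n-1)}]$, is easy to evaluate numerically. The sketch below uses only the Python standard library; the function name `approx_variance_interval` is an illustrative choice, and the guard against a non-positive denominator is an addition of this sketch, not something discussed in the text.

```python
from math import sqrt
from statistics import NormalDist


def approx_variance_interval(s2, n, confidence=0.95):
    """Large-sample interval s2 / (1 +/- k * sqrt(2/(n-1))) for a Gaussian population."""
    k = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)  # k_{alpha/2}
    spread = k * sqrt(2.0 / (n - 1))
    if spread >= 1.0:
        # The upper limit s2 / (1 - spread) is undefined or negative for small n.
        raise ValueError("n is too small for the large-sample approximation")
    return s2 / (1.0 + spread), s2 / (1.0 - spread)


# Concrete strength data: s^2 = 0.44 with n = 25 observations.
lower, upper = approx_variance_interval(s2=0.44, n=25)
print(round(lower, 2), round(upper, 2))
```

The interval is not symmetric about $s^2$, because the two limits divide by different factors; the asymmetry grows as $n$ decreases.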


Exact confidence limits of $\sigma^2$ for normal population. The approximations
given in Eqs. 5.36 and 5.38 can be quite poor when $n$ is small. If
the population is normal, however, exact confidence limits can be obtained;
the basis for such exact estimates is as follows.
Rewriting Eq. 5.31, we have

$$(n-1)S^2 = \sum_{i=1}^{n}\left[(X_i - \mu) - (\bar{X} - \mu)\right]^2 = \sum_{i=1}^{n}(X_i - \mu)^2 - n(\bar{X} - \mu)^2$$
i=1 


Figure 5.8 Chi-square distribution with different f


Dividing through by $\sigma^2$, we obtain

$$\frac{(n-1)S^2}{\sigma^2} = \sum_{i=1}^{n}\left(\frac{X_i - \mu}{\sigma}\right)^2 - \left(\frac{\bar{X} - \mu}{\sigma/\sqrt{n}}\right)^2 \tag{5.39}$$

where $X_i$ are presumed to be normal, and thus $\bar{X}$ is also normal. Then the
first term on the right side of Eq. 5.39 is the sum of squares of $n$
independent standard normal variates; it can be shown, by generalizing the
result of Example 4.6, that this has a chi-square distribution with $n$
degrees of freedom (to be denoted $\chi_n^2$). Similarly, the second term on the
right side of Eq. 5.39 is also the square of a standard normal variate and
therefore has a chi-square distribution with one degree of freedom. Moreover,
it can be shown (Hoel, 1962) that the sum of two chi-square variates
with $p$ and $q$ degrees of freedom is also a chi-square variate with $(p + q)$
degrees of freedom. On these bases, therefore, $(n-1)S^2/\sigma^2$ of Eq. 5.39
has a $\chi_{n-1}^2$ distribution; that is, a chi-square distribution with $(n - 1)$
degrees of freedom.
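The conclusion that $(n-1)S^2/\sigma^2$ follows a chi-square distribution with $(n - 1)$ degrees of freedom can be checked informally by simulation. The sketch below relies on two standard properties of the chi-square distribution that are not stated in the text, namely that its mean equals its degrees of freedom $f$ and its variance equals $2f$. The function name, the population parameters, the number of trials, and the seed are all arbitrary choices made for illustration.

```python
import random
from statistics import mean, variance


def scaled_sample_variances(n, mu, sigma, trials, seed=0):
    """Simulate (n - 1) * S^2 / sigma^2 for repeated normal samples of size n."""
    rng = random.Random(seed)          # fixed seed so the run is repeatable
    values = []
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        values.append((n - 1) * variance(sample) / sigma ** 2)
    return values


n = 10
sim = scaled_sample_variances(n=n, mu=50.0, sigma=4.0, trials=5000)
sim_mean = mean(sim)       # should be close to f = n - 1
sim_var = variance(sim)    # should be close to 2 f = 2 (n - 1)
print(sim_mean, sim_var)
```

Agreement of the first two moments does not prove the distributional result, but a clear disagreement would indicate an error in the derivation or in the code.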

In general, the density function of the $\chi_f^2$ distribution with $f$ degrees of
freedom is given by

$$f_C(c) = \frac{1}{2^{f/2}\,\Gamma(f/2)}\,c^{(f/2 - 1)}\,e^{-c/2} \qquad c \ge 0 \tag{5.40}$$

Such a distribution is shown in Fig. 5.8 for different degrees of freedom $f$.
As would be expected, by virtue of the central limit theorem, the $\chi_f^2$
distribution approaches the normal distribution as $f \to \infty$; this may be
observed also in Fig. 5.8. Because $f = n - 1$, this gives us a basis (albeit
crude) for determining the sample size $n$ necessary to ensure reasonable
approximations of Eqs. 5.36 and 5.38. Visually, from Fig. 5.8, it appears
that $n > 25$ may be sufficient sample size to permit the applications of
Eqs. 5.36 and 5.38.

If the population is Gaussian, the upper confidence limit for the variance
$\sigma^2$, according to the foregoing chi-square distribution, is given by

$$P\left[\frac{(n-1)S^2}{\sigma^2} > c_{\alpha,\,n-1}\right] = 1 - \alpha \tag{5.41}$$

where $c_{\alpha,\,n-1}$ denotes the value of the $\chi_{n-1}^2$ variate at the
cumulative probability of $\alpha$; that is,

$$P(C \le c_{\alpha,\,n-1}) = \alpha$$

as illustrated graphically in Fig. 5.9. Values of $c_{\alpha,\,n-1}$ are tabulated in
Table A.3 of Appendix A for specified values of $\alpha$ and $n - 1 = f$.

Then the exact $(1 - \alpha)$ upper confidence limit for $\sigma^2$ of a normal
population is

$$(\sigma^2\rangle_{1-\alpha} = \frac{(n-1)s^2}{c_{\alpha,\,n-1}} \tag{5.42}$$

. Although two-sided confidence intervals: for o? may similarly be de- 
veloped, the one-sided (upper) confidence limit of Eq. 5.42 is more ac 
in the case of variances. : 
EXAMPLE 5.13

For the DO data in Example 5.5, we had $s^2 = 4.2$ and $n = 30$. If a 95% upper
confidence limit on $\sigma^2$ is desired, then from Table A.3 (with $\alpha = 0.05$)
we obtain $c_{\alpha,\,n-1} = c_{0.05,29} = 17.708$. Hence the (upper) confidence
limit for the variance of DO is

$$(\sigma^2\rangle_{0.95} = \frac{29 \times 4.2}{17.708} = 6.88 \ (\text{mg/l})^2$$


Figure 5.9 Chi-square distribution with (n − 1) degrees of freedom


5.2.5. Estimation of proportion


In many engineering problems requiring probabilistic formulations,
the necessary probability measures must be estimated on the basis of
experimental observations; for example, the probability of hurricane-intensity
wind occurring in a year, the proportion of vehicular traffic
making left turns at an intersection, or the proportion of embankment
material meeting specified compaction standards.

In such cases, the required probability may be estimated as the proportion
of occurrences (of an event) in a Bernoulli sequence (see Section
3.2.8). Suppose that we have a sequence of $n$ independent trials
$X_1, X_2, \ldots, X_n$, where every $X_i$ is a two-valued random variable;
specifically, $X_i = 1$ or 0 denotes the occurrence or nonoccurrence of an event
in the $i$th trial. Then the sequence $X_1, X_2, \ldots, X_n$ constitutes a random
sample of size $n$.

The probability $p$ of occurrence of an event in a trial is the parameter
in the binomial distribution. The maximum likelihood estimator of this
probability can be shown to be

$$\hat{P} = \frac{1}{n}\sum_{i=1}^{n} X_i \tag{5.43}$$

In other words, the estimate of $p$ is the proportion of occurrences among
the sequence of $n$ trials.

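Confidence interval for $p$ can also be developed as follows. Observe first
that

$$E(\hat{P}) = \frac{1}{n}E\left(\sum_{i=1}^{n} X_i\right) = \frac{1}{n}\sum_{i=1}^{n} E(X_i)$$

but

$$E(X_i) = 1(p) + 0(1 - p) = p$$

Hence

$$E(\hat{P}) = \frac{1}{n}(np) = p \tag{5.44}$$

And

$$\mathrm{Var}(\hat{P}) = \frac{1}{n^2}\sum_{i=1}^{n}\mathrm{Var}(X_i) = \frac{1}{n^2}\sum_{i=1}^{n}\left[E(X_i^2) - p^2\right]$$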


Figure 5.10 95% confidence interval of proportion (after Clopper and Pearson,
1934); axes: observed proportion $\hat{p}$ versus theoretical proportion $p$


where $E(X_i^2) = p$. Thence,

$$\mathrm{Var}(\hat{P}) = \frac{1}{n^2}\,n(p - p^2) = \frac{p(1 - p)}{n} \tag{5.45}$$
Therefore the estimator $\hat{P}$ is centered around $p$ with a variance that
decreases with the sample size $n$.
For large $n$, $\hat{P}$ will be approximately Gaussian by virtue of the central
limit theorem; furthermore, the variance of Eq. 5.45 may be approximated
by

$$\mathrm{Var}(\hat{P}) \simeq \frac{\hat{p}(1 - \hat{p})}{n} \tag{5.45a}$$

where $\hat{p}$ is the observed proportion from the sample data.


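Then the confidence interval for $p$ is obtained from

$$P\left(-k_{\alpha/2} < \frac{\hat{P} - p}{\sqrt{\hat{p}(1 - \hat{p})/n}} \le k_{\alpha/2}\right) = 1 - \alpha \tag{5.46}$$

giving the $(1 - \alpha)$ confidence interval as

$$\langle p\rangle_{1-\alpha} = \left(\hat{p} - k_{\alpha/2}\sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}\,;\; \hat{p} + k_{\alpha/2}\sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}\right) \tag{5.47}$$

Figure 5.10 is a graph showing the 95% confidence interval for $p$ as a
function of the observed proportion $\hat{p}$ for different sample sizes $n$.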


EXAMPLE 5.14 


In inspecting the quality of soil compaction in a highway project, 10 out of 50
specimens inspected do not pass the CBR requirement. It is desired to estimate the
actual proportion $p$ of embankment that will be well compacted (that is, meet CBR
requirement) and also establish a 95% confidence interval on $p$.
The point estimate for $p$ is given by Eq. 5.43 as

$$\hat{p} = \frac{40}{50} = 0.8$$

The corresponding 95% confidence interval is, according to Eq. 5.47,

$$\langle p\rangle_{0.95} = \left(0.8 - 1.96\sqrt{\frac{0.8(1 - 0.8)}{50}}\,;\; 0.8 + 1.96\sqrt{\frac{0.8(1 - 0.8)}{50}}\right) = (0.69;\; 0.91)$$
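The large-sample interval for a proportion, $\hat{p} \pm k_{\alpha/2}\sqrt{\hat{p}(1 - \hat{p})/n}$, can be sketched as follows. The function name `proportion_interval` is illustrative, and the normal quantile is computed from the standard library rather than taken from a table.

```python
from math import sqrt
from statistics import NormalDist


def proportion_interval(successes, n, confidence=0.95):
    """Point estimate and large-sample confidence interval for a proportion."""
    p_hat = successes / n                                     # observed proportion
    k = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)  # k_{alpha/2}
    half_width = k * sqrt(p_hat * (1.0 - p_hat) / n)
    return p_hat, (p_hat - half_width, p_hat + half_width)


# Soil compaction inspection: 40 of 50 specimens meet the CBR requirement.
p_hat, (lower, upper) = proportion_interval(successes=40, n=50)
print(p_hat, round(lower, 2), round(upper, 2))
```

This normal approximation is intended for large $n$; for small samples, or for $\hat{p}$ near 0 or 1, the limits can fall outside the range 0 to 1, and charts of the kind shown in Fig. 5.10 are the safer guide.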


5.3. CONCLUDING REMARKS


In modeling real-world situations, the form of the probability distribution
of a random variable may be deduced theoretically on the basis
of physical considerations or inferred empirically on the basis of observational
data. However, the parameters of the distribution or the main
descriptors (mean and variance) of the random variable must necessarily
be related empirically to the real world; therefore, estimation based on
factual data is required. Classical methods of parameter estimation are
presented in this chapter; in Chapter 6 empirical and inferential methods for
determining probability distributions are described.

Classical methods of estimation are of two types: point and interval
estimations. The common methods of point estimation are the method of
maximum likelihood and the method of moments; the former derives the
estimator directly; the latter evaluates a parameter by first estimating the
moments (usually the mean and variance) of the variate through the
corresponding sample moments. Interval estimation includes a determination
of the interval that contains the parameter value with a prescribed level
of confidence.


It should be recognized that when population parameters are estimated
on the basis of finite samples, errors of estimation are unavoidable. Within
the classical methods of estimation, the significance of such errors is not
reflected in the (point) estimates of the parameters; they can only be
expressed in terms of appropriate confidence intervals. Explicit consideration
of these errors is embodied in the Bayesian approach to estimation,
which is the subject of Chapter 8.


PROBLEMS 


5.1 In the measurement of daily dissolved oxygen (DO) concentrations in a
    stream, let $p$ denote the probability that the DO concentration will fall below
    the required standard on a single day. DO concentration is measured daily
    until unsatisfactory stream quality is encountered, and the number of days in
    this sequence of measurement is recorded. Suppose 10 sequences have been
    observed and the length of each sequence is
        2, 5, 6, 4, 6, 6, 8, 5, 10, 1 (days)
    (a) Determine the maximum likelihood estimator for $p$, and estimate $p$ on
        the basis of the observed data.
    (b) Estimate $p$ by the method of moments. (Hint. Use the relations in
        Table 5.1.)
5.2 For the concrete crushing strength data tabulated in Table E5.1 in Example
    5.1, determine the point estimates for $\mu$ and $\sigma$ by the method of maximum
    likelihood. Assume that concrete strength follows a Gaussian distribution.


5.3 The distribution of wave height has been suggested to follow a Rayleigh 


density function, UA 
fall) e were hao 
with parameter $\alpha$. Suppose the following measurements on wave heights were
recorded: 1.5, 2.8, 2.5, 3.2, 1.9, 4.1, 3.6, 2.6, 2.9, 2.3 m.
Estimate the parameter $\alpha$ by the method of maximum likelihood.


5.4 Data on rainfall intensities (in inches) collected between 1918 and 1946 for the
    watershed of Esopus Creek, N.Y., are tabulated below as follows:

        1918  43.30    1925  43.93    1932  50.37    1939  42.96
        1919  53.02    1926  46.77    1933  54.91    1940  55.77
        1920  63.52    1927  59.12    1934  51.28    1941  41.31
        1921  45.93    1928  54.49    1935  39.91    1942  58.83
        1922  48.26    1929  47.38    1936  53.29    1943  48.21
        1923  50.51    1930  40.78    1937  67.59    1944  44.67
        1924  49.57    1931  45.05    1938  58.71    1945  67.72
                                                     1946  43.11

    (a) Determine the point estimates for the mean $\mu$ and variance $\sigma^2$.
    (b) Determine the 95% confidence interval for the mean $\mu$. Assume the
        annual rainfall intensity is Gaussian, and $\sigma \simeq s$.


5.5 Consider the annual maximum wind velocity ($V$) data given in Problem 3.25.
    (a) Calculate the sample mean and sample variance of $V$.
    (b) Determine the 99% confidence interval for the mean velocity. Assume
        that the true standard deviation of $V$, $\sigma_V$, is satisfactorily given
        by the sample standard deviation $s_V$.
    (c) Assume that $V$ has a log-normal distribution; determine the point
        estimates for the corresponding parameters $\lambda_V$ and $\zeta_V$.


5.6 A structure is designed to rest on 100 piles. Nine test piles were driven at
    random locations into the supporting soil stratum and loaded until failure
    occurred. Results are tabulated as follows.

        Test pile    Pile capacity (tons)
        (tabulated values are illegible in the source)

    (a) Estimate the mean and standard deviation of the individual pile capacity
        to be used at the site.
    (b) Establish the 98% confidence interval for the mean pile capacity
        assuming known $\sigma = s$.
    (c) Determine the 98% confidence interval for the mean pile capacity on the
        basis of unknown variance.


5.7 The daily dissolved oxygen concentration (DO) for a location A downstream
    from an industrial plant has been recorded for 10 consecutive days:

        Day    DO (mg/l)
        (tabulated values are illegible in the source)

    (a) Assume that the daily DO concentration has a normal distribution
        $N(\mu, \sigma)$; estimate the values of $\mu$ and $\sigma$.
    (b) Determine the 95% confidence interval for the true mean $\mu$.
    (c) Determine the 95% lower confidence limit of $\mu$.
5.8 A river has the following record on the levels of floods that occurred each year
    from 1960 through 1970.

                Flood level (m)
        Year    (above mean flow)
        1960    3.7
        1961    2.3
        1962    5.1, 3.5
        1963    5.2
        1964    4.7, 6.1, 5.2
        1965    3.4, 7.2, 1.5
        1966    1.5, 3.6
        1967    5.2, 1.4
        1968    1.3, 4.5
        1969    3.4
        1970    4.4, 2.4

    (a) Draw the histogram of flood levels at 1-m interval.
    (b) Draw the histogram of the annual maximum flood levels at 1-m interval.
    (c) Based on the histogram, what is the return period for a 7-m flood?
    (d) Compute sample mean and sample variance of the annual maximum
        flood.
    (e) Establish the 99% 2-sided confidence interval for the mean annual
        maximum flood.
    (f) Assume that the annual maximum flood level has a log-normal distribution
        with the mean and variance computed in part (d); on this basis,
        determine the return period for a 7-m flood of this river.

5.9 From a set of data on the daily BOD level at a certain station for 30 days, the
    following have been computed:

        $\bar{x} = 3.5$ (mg/l)
        $s^2 = 0.184$ (mg/l)$^2$

    Assume that the daily BOD level is a Gaussian variable.
    (a) Estimate the mean and standard deviation of the BOD level.
    (b) Determine the 99.5% confidence interval for the mean BOD.
    (c) If the engineer is not satisfied with the width of the confidence interval
        established in part (b), and would like to reduce this interval by 10%,
        keeping the 99.5% confidence level, how many additional daily measurements
        have to be gathered? Ans. 7.


5.10 Suppose that a sample of 9 steel reinforcing bars were tested for yield strength,
    and the sample mean was found to be 20 kips.
    (a) What is the 90% confidence interval for the population mean, if the
        standard deviation is assumed to be equal to 3 kips?
    (b) How many additional bars must be tested to increase the confidence of
        the same interval to 95%? Ans. 4.


(a) Determine the best estimates of the outer and inner radii, and cor- 
responding standard errors. 

(b) The shaded area between the two concentric circles is computed based 
on the mean values of the measured outer and inner radii; namely 
A = n(F? — FP). What is the computed area? Ans. 12.57 ém?, 

(c) Determine the standard deviation (standard error) of the computed area. 
Ans. 0.819 cm?. 

(d) If it is desired to determine the sample mean of r, within +0.07 cm 
with 99% confidence, how many additional independent measurements 
must be made on,r,? Assume that all measurements are independent and 
taken with the same care and skill, Ans. 12. 


5.15 The distance between A and C is measured in 2 stages: namely, AB and BC as
    shown in Fig. P5.15. Measurements on AB and BC are recorded as follows:

        AB: 100.5, 99.6, 100.1, 100.3, 99.5 ft
        BC: 50.2, 49.8, 50.0 ft

    (a) Compute the sample mean and sample variance of the measured distances
        for AB.
    (b) Compute the standard error of the estimated distance of AB, that is,
        $s_{\overline{AB}}$.
    (c) Establish the 98% confidence interval for the actual distance AB.
    (d) If the distance AC is given by the sum of the estimated distances
        $\overline{AB}$ and $\overline{BC}$, that is,

        $$\overline{AC} = \overline{AB} + \overline{BC}$$

        what is the standard error of the estimated total distance between A
        and C?
    (e) Establish the 98% confidence interval on the actual length AC.


Figure P5.15


6. Empirical Determination 
of Distribution Models 


6.1. INTRODUCTION 


The probabilistic characteristics of a random phenomenon are sometimes
difficult to discern or define, such that the appropriate probability model
needed to describe these characteristics is not readily amenable to
theoretical deduction or formulation. In particular, the functional form of the
required probability distribution may not be easy to derive or ascertain.
Under certain circumstances, the basis or properties of the physical process
may suggest the form of the required distribution. For example, if a process
is composed of the sum of many individual effects, the Gaussian distribution
may be appropriate on the basis of the central limit theorem; whereas,
if the extremal conditions of a physical process are of interest, an
extreme-value distribution may be a suitable model.

Nevertheless, there are occasions when the required probability distribution
has to be determined empirically (that is, based entirely on available
observational data). For example, if the frequency diagram for a set
of data can be constructed, the required distribution model may be determined
by visually comparing a density function with the frequency diagram
(see for example, Figs. 1.5 through 1.7). Alternatively, the data may be
plotted on probability papers prepared for specific distributions (see Section
6.2 below). If the data points plot approximately on a straight line on one
of these papers, the distribution corresponding to this paper may be an
appropriate distribution model.

Furthermore, an assumed probability distribution (perhaps determined
empirically as described above, or developed theoretically on the basis of
prior assumptions) may be verified, or disproved, in the light of available
data using certain statistical tests, known as goodness-of-fit tests for
distribution. Moreover, when two or more distributions appear to be plausible
probability distribution models, such tests can be used to delineate the
relative degree of validity of the different distributions. Two such tests are
commonly used for these purposes: the chi-square ($\chi^2$) and the
Kolmogorov-Smirnov (K-S) tests.


In practice, the choice of the probability distribution may also be
dictated by mathematical tractability or convenience. For example, because
of the mathematical simplifications possible with the normal distribution,
and the wide availability of information (probability tables) associated
with this distribution, the normal (or log-normal) distribution is
frequently used to model nondeterministic problems, at times even when
there is no clear basis for such a model. Probabilistic information derived
on the basis of such prescribed distributions could be useful, especially when
the information is needed only for relative purposes.

6.2. PROBABILITY PAPER


Graph papers for plotting observed experimental data and their corresponding
cumulative frequencies (or probabilities) are called probability papers.
Probability papers are constructed such that a given probability paper is
associated with a specific probability distribution; that is, different
probability papers correspond to different probability distributions.

Preferably, a probability paper should be constructed using a transformed
probability scale in such a manner as to obtain a linear graph between
the cumulative probabilities of the underlying distribution and the
corresponding values of the variate. For example, in the case of the uniform
distribution, the cumulative distribution function is linearly related to the
values of the variate; thus the probability paper for this distribution would
be constructed using arithmetic scales for the values of the variate and the
associated cumulative probabilities (between 0 and 1.0). For other distributions,
however, special scales are required for the cumulative probabilities
in order to achieve the desired linear relationship.

The linearity, or lack of linearity, of a set of sample data plotted on a
particular probability paper, therefore, can be used as a basis for determining
whether the distribution of the underlying population is the same as
that of the probability paper. On this basis, then, probability papers may
be used to establish or explore the possible distribution(s) of the underlying
population. In Sections 6.2.1 and 6.2.2 we illustrate the construction and
application of two commonly used probability papers, the normal and the
log-normal probability papers, and in Section 6.2.3 we describe the
construction of probability papers for a general distribution.

Experimental data may be plotted on any probability paper; the plotting
position of each data point is determined as follows.

If there are $N$ observations $x_1, x_2, \ldots, x_N$, the $m$th value among the $N$
observations (arranged in increasing order) is plotted at the cumulative
probability $m/(N + 1)$.

This plotting position applies to all probability papers; its basis is
discussed in Gumbel (1954). Other plotting positions exist, such as
$(m - \tfrac{1}{2})/N$, which was advocated by Hazen (1930) and has been
used widely; the Hazen position, however, has a certain theoretical weakness. In
particular, when there are $N$ observations, the plotting position
$(m - \tfrac{1}{2})/N$ would yield a return period of $2N$ for the largest
observation instead of $N$ (Gumbel, 1954). Still other plotting positions have been
suggested (for example, Kimball, 1946); however, none seems to have the theoretical
attributes and the computational simplicity of $m/(N + 1)$.


Figure 6.1 Construction of normal probability paper
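The two plotting positions discussed above can be compared numerically. The sketch below assumes the usual definition of the return period as $1/(1 - p)$ for a value plotted at cumulative probability $p$; that definition and all function names are assumptions of this illustration rather than statements from the text.

```python
def plotting_positions(n):
    """Cumulative probabilities m/(N + 1) for the ordered observations m = 1..N."""
    return [m / (n + 1) for m in range(1, n + 1)]


def hazen_positions(n):
    """Hazen's alternative plotting positions (m - 1/2)/N."""
    return [(m - 0.5) / n for m in range(1, n + 1)]


def return_period(p):
    """Return period 1/(1 - p) implied by a cumulative probability p."""
    return 1.0 / (1.0 - p)


N = 26
recommended = plotting_positions(N)
hazen = hazen_positions(N)

# Return period assigned to the largest of the N observations by each rule.
largest_recommended = return_period(recommended[-1])   # equals N + 1
largest_hazen = return_period(hazen[-1])               # equals 2N
print(recommended[0], largest_recommended, largest_hazen)
```

Under this definition the position $m/(N + 1)$ assigns the largest of $N$ observations a return period of $N + 1$, close to the record length, whereas the Hazen position assigns it $2N$.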


6.2.1. The normal probability paper


The normal (or Gaussian) probability paper is constructed on the basis
of the standard normal distribution function as follows. One axis (in
arithmetic scale) represents the values of the variate $X$, as illustrated
in Fig. 6.1. On the other axis are two parallel scales; one in arithmetic
scale represents values of the standard normal variate $s$, whereas the other
shows the cumulative probabilities $\Phi_S(s)$ corresponding to the indicated
values of $s$ as shown in Fig. 6.1. A normal variate $X$ with distribution
$N(\mu, \sigma)$ would then be represented on this paper by a straight line passing
through $\Phi_S(s) = 0.50$ and $X = \mu$, with a slope $(x_p - \mu)/s = \sigma$, where
$x_p$ is the value of the variate at probability $p$. In particular, at $p = 0.84$,
$s = 1$; hence, the slope is $(x_{0.84} - \mu)$.

Such normal probability papers are available commercially. The scale for
the standard variate $s$, however, is usually omitted in such commercial
papers.

Any set of data may be plotted on the normal probability paper; however,
if the resulting graph of data points shows a lack of linearity, this would
suggest that the underlying population is not Gaussian. Conversely, if the
data points plotted on this paper show a linear or approximately linear
trend, the straight line through these data points represents a specific
normal distribution applicable to the data set (at least within the range of
observations). The mean value and standard deviation of the underlying
population may also be determined graphically from this straight line:
the value of $X$ on this line corresponding to $\Phi_S(s) = 0.50$ is the estimate
of the mean value $\mu_X$, whereas the slope of the straight line is the estimate
of the standard deviation $\sigma_X$; thus $\sigma_X \simeq x_{0.84} - \mu_X$ (see Fig. 6.1).


EXAMPLE 6.1 


The data for fracture toughness of steel plate, given in Table E6.1, are plotted on
the normal probability paper in Fig. E6.1.


Table E6.1. Fracture Toughness of Base
Plate of 18% Nickel Maraging Steel (Data
From Kies et al., 1965)

     m    K_Ic (ksi √in.)    m/(N + 1)
     1    69.5               0.0370
     2    71.9               0.0741
     3    72.6               0.1111
     4    73.1               0.1481
     5    73.3               0.1852
     6    73.5               0.2222
     7    74.1               0.2592
     8    74.2               0.2963
     9    75.3               0.3333
    10    75.5               0.3704
    11    75.7               0.4074
    12    75.8               0.4444
    13    76.1               0.4815
    14    76.2               0.5185
    15    76.2               0.5556
    16    76.9               0.5926
    17    77.0               0.6296
    18    77.9               0.6667
    19    78.1               0.7037
    20    79.6               0.7407
    21    79.7               0.7778
    22    79.9               0.8148
    23    80.1               0.8518
    24    82.2               0.8889
    25    83.7               0.9259
    26    93.7               0.9630


Figure E6.1 Fracture toughness plotted on normal probability paper (axes:
fracture toughness K_Ic, ksi √in., versus cumulative probability m/(N + 1))


In Fig. E6.1, values of the fracture toughness $K_{Ic}$ are plotted against the plotting
positions m/(N + 1), with N = 26.

The straight line shown in Fig. E6.1 is drawn (by eye) through the data points,
from which we find $\mu_{K_{Ic}}$ = 77 ksi √in. Also, we observe that the value of $K_{Ic}$ at the
84% probability level is 81.6; thus $\sigma_{K_{Ic}}$ = 81.6 − 77 = 4.6 ksi √in.
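
The graphical procedure of Example 6.1 can be imitated numerically. The following is only a sketch under stated assumptions: it replaces the line drawn by eye with an ordinary least-squares line through the points $(s_m, x_m)$, where $s_m$ is the standard normal variate at the plotting position m/(N + 1); the function name `fit_normal_paper` is hypothetical, and the data are those of Table E6.1 as they read in this copy. The intercept at s = 0 plays the role of the mean and the slope that of the standard deviation, so the numbers need not coincide exactly with the by-eye values 77 and 4.6.

```python
from statistics import NormalDist, mean

# Fracture toughness data of Table E6.1 (ksi sqrt(in.)), as read from this copy
k_ic = [69.5, 71.9, 72.6, 73.1, 73.3, 73.5, 74.4, 74.2, 75.3, 75.5,
        75.7, 75.8, 76.1, 76.2, 76.2, 76.9, 77.0, 77.9, 78.1, 79.6,
        79.7, 79.9, 80.1, 82.2, 83.7, 93.7]


def fit_normal_paper(data):
    """Least-squares line x = mu + sigma * s on normal probability paper.

    Assumption: a least-squares fit stands in for the straight line
    that the text draws by eye.
    """
    xs = sorted(data)
    n = len(xs)
    # Standard normal variate at each plotting position m/(N + 1)
    s = [NormalDist().inv_cdf(m / (n + 1)) for m in range(1, n + 1)]
    s_bar, x_bar = mean(s), mean(xs)
    slope = (sum((si - s_bar) * (xi - x_bar) for si, xi in zip(s, xs))
             / sum((si - s_bar) ** 2 for si in s))
    intercept = x_bar - slope * s_bar   # value of x at s = 0, i.e. at p = 0.50
    return intercept, slope


mu_hat, sigma_hat = fit_normal_paper(k_ic)
print(f"mean estimate  = {mu_hat:.1f} ksi sqrt(in.)")
print(f"sigma estimate = {sigma_hat:.1f} ksi sqrt(in.)")
```

A least-squares line gives more weight to the extreme point (93.7) than a line drawn by eye usually does, which is one reason the slope may differ somewhat from the graphical estimate.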


6.2.2. The log-normal probability paper

The logarithmic normal probability paper can be obtained from the normal
probability paper by simply changing the arithmetic scale for values of the
variate X (on the normal probability paper) to a logarithmic scale. The
resulting paper would be as shown in Fig. 6.2. In this case, the standard
normal variate becomes

$$ s = \frac{\ln x - \ln x_m}{\zeta} = \frac{1}{\zeta} \ln \frac{x}{x_m} $$

where $x_m$ is the median of X.
If a random phenomenon can be modeled approximately with a log-normal
distribution, then experimental data obtained therefrom should be
approximately linear when the mth value among N observations and their
plotting positions m/(N + 1) are plotted on the log-normal probability
paper. If the plotted data points yield a straight line, this line represents
the particular log-normal distribution for the underlying population.
Accordingly, the median $x_m$ is simply the value of the variate on this line
corresponding to $\Phi_S(s) = 0.50$, whereas the parameter $\zeta$ is given by the
slope of the line, that is,

$$ \zeta = \ln \frac{x_{0.84}}{x_m} $$

where $x_{0.84}$ is the value of the variate on the line at the cumulative probability 0.84 (s = 1).
EXAMPLE 6.2

Data for the fracture toughness of MIG welds are tabulated in Table E6.2.
The plotting positions m/(N + 1) are shown in column 3 of Table E6.2; these are
plotted against the fracture toughness $K_{Ic}$ on log-normal probability paper in
Fig. E6.2.

On the basis of the linear graph of the plotted data shown in Fig. E6.2, we may say
that the fracture toughness of such welds has a log-normal distribution. Specifically,
the straight line drawn (by eye) through the data points represents a log-normal
distribution with a median of 74 ksi √in. and a COV of 12%.

Figure E6.2 Fracture toughness of welds plotted on log-normal probability paper (line annotated in the figure: slope = ln(x_0.84/x_m) = ln(84/74) = 0.12; abscissa: cumulative probability m/(N + 1))


Table E6.2. Fracture Toughness of MIG Welds (Data from Kies et al., 1965)

  m     K_Ic (ksi √in.)     m/(N + 1)
  1         54.4              0.05
  2         62.6              0.10
  3         63.2              0.15
  4         67.0              0.20
  5         70.2              0.25
  6         70.5              0.30
  7         70.6              0.35
  8         71.4              0.40
  9         71.8              0.45
 10         74.1              0.50
 11         74.1              0.55
 12         74.3              0.60
 13         78.8              0.65
 14         81.8              0.70
 15         83.0              0.75
 16         84.4              0.80
 17         85.3              0.85
 18         86.9              0.90
 19         87.3              0.95

Table E6.3. Precipitation and Runoff Data for Example 6.3

  m     Precipitation X (in.)     Runoff Y (in.)     m/(N + 1)
  1          1.01                     0.17             0.038
  2          1.11                     0.1[?]           0.077
  3          1.13                     0.23             0.115
  4          [illegible]              [illegible]      0.154
  5          1.16                     0.39             0.192
  6          1.17                     0.39             0.231
  7          1.17                     0.40             0.269
  8          1.20                     0.45             0.308
  9          1.52                     0.52             0.346
 10          1.54                     0.56             0.385
 11          1.54                     0.59             0.423
 12          1.57                     0.64             0.462
 13          1.64                     0.66             0.500
 14          1.73                     0.70             0.538
 15          1.79                     0.76             0.577
 16          2.09                     [illegible]      0.615
 17          2.09                     0.78             0.654
 18          2.57                     0.95             0.692
 19          2.75                     0.97             0.731
 20          2.93                     1.02             0.769
 21          3.19                     1.12             0.808
 22          3.54                     1.24             0.846
 23          3.57                     1.59             0.885
 24          5.11                     1.74             0.923
 25          [illegible]              [illegible]      0.962
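
The log-normal paper of Section 6.2.2 amounts to plotting ln x against the standard normal variate. The sketch below applies this to the weld data of Table E6.2. It is an illustration under assumptions, not the book's procedure: a least-squares line replaces the line drawn by eye, the function name `fit_lognormal_paper` is hypothetical, and the COV is computed from the standard log-normal relation $\delta = \sqrt{e^{\zeta^2} - 1}$, which is approximately $\zeta$ when $\zeta$ is small.

```python
import math
from statistics import NormalDist, mean

# Fracture toughness of MIG welds, Table E6.2 (ksi sqrt(in.))
welds = [54.4, 62.6, 63.2, 67.0, 70.2, 70.5, 70.6, 71.4, 71.8, 74.1,
         74.1, 74.3, 78.8, 81.8, 83.0, 84.4, 85.3, 86.9, 87.3]


def fit_lognormal_paper(data):
    """Least-squares line ln(x) = ln(x_m) + zeta * s on log-normal paper.

    Assumption: a least-squares fit stands in for the by-eye line.
    Returns the median estimate x_m and the parameter zeta (the slope).
    """
    ys = [math.log(x) for x in sorted(data)]
    n = len(ys)
    s = [NormalDist().inv_cdf(m / (n + 1)) for m in range(1, n + 1)]
    s_bar, y_bar = mean(s), mean(ys)
    zeta = (sum((si - s_bar) * (yi - y_bar) for si, yi in zip(s, ys))
            / sum((si - s_bar) ** 2 for si in s))
    x_m = math.exp(y_bar - zeta * s_bar)   # variate value at s = 0 (p = 0.50)
    return x_m, zeta


median_hat, zeta_hat = fit_lognormal_paper(welds)
cov_hat = math.sqrt(math.exp(zeta_hat ** 2) - 1.0)
print(f"median = {median_hat:.1f} ksi sqrt(in.), zeta = {zeta_hat:.3f}, COV = {cov_hat:.3f}")
```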


Figure E6.3a Precipitation plotted on log-normal probability paper (ordinate: precipitation, in.; abscissa: cumulative probability m/(N + 1))


EXAMPLE 6.3

Plot on log-normal probability papers the precipitation and runoff data of the
Monocacy River described in Example 5.8.

Rearranging the data given in Example 5.8 in increasing order, we obtain
Table E6.3.


Figure E6.3b Runoff plotted on log-normal probability paper (line annotated in the figure: x_m = 0.66, x_0.84 = 1.45, ζ = ln(1.45/0.66) = 0.79; abscissa: cumulative probability m/(N + 1))


The precipitation values are plotted against the respective plotting positions
m/(N + 1) on log-normal probability paper in Fig. E6.3a. Similarly, the runoff data
are plotted on log-normal probability paper in Fig. E6.3b. From these two figures,
the following may be inferred.

(a) The absence of linearity in the graph of Fig. E6.3a means that the distribution
of precipitation in the Monocacy River basin is not log-normal.

(b) On the basis of Fig. E6.3b, the runoff of the Monocacy River may be described
with a log-normal distribution, with median $x_m$ = 0.66 in. and parameter
$\zeta$ = ln(1.45/0.66) = 0.79.


6.2.3. Construction of general probability paper 


As indicated earlier, probability papers are constructed in such a way
that the values of the variate and the associated cumulative probabilities
yield a straight line on a two-dimensional graph; conversely, therefore, a
straight line on a specific probability paper represents a particular
distribution (consistent with that of the probability paper) with given values
of the parameters. For this purpose, a probability paper should be
constructed so that it is independent of the values of the parameters of the
distribution. This is accomplished by defining a standard variate (if one
exists) appropriate for the given distribution.

In the last two sections, we illustrated the construction and application
of the normal and log-normal probability papers; similar papers may be
constructed for other distributions. We illustrate this with the following
examples.


EXAMPLE 6.4

The density function of the (shifted) exponential distribution, Eq. 3.43, is

$$ f_X(x) = \begin{cases} \lambda e^{-\lambda (x - a)}; & x \ge a \\ 0; & x < a \end{cases} $$

where $\lambda$ is the parameter, and a is the minimum value of X. In this case, the standard
variate is $S = \lambda (X - a)$. The density function of S then, according to Eq. 4.6, is

$$ f_S(s) = f_X(x) \left| \frac{dx}{ds} \right| = \begin{cases} e^{-s}; & s \ge 0 \\ 0; & s < 0 \end{cases} $$

with corresponding CDF

$$ F_S(s) = 1 - e^{-s}; \quad s \ge 0 $$


On this basis, therefore, we construct the exponential probability paper as follows.
On one axis, scale values (in convenient arithmetic scale) of the standard variate s;
on the same (or a parallel) axis, mark the corresponding cumulative probabilities
$F_S(s) = 1 - e^{-s}$. The other (perpendicular) axis will represent values of the variate
X (in arithmetic scale). For illustration, specific values of s and $F_S(s)$ have been
calculated as summarized in Table E6.4a.

Table E6.4a. Specific Values of s and F_S(s)

   s      F_S(s)        s      F_S(s)
  0.11     0.10        2.53     0.92
  0.22     0.20        2.66     0.93
  0.36     0.30        2.81     0.94
  0.51     0.40        3.00     0.95
  0.69     0.50        3.10     0.955
  0.80     0.55        3.22     0.96
  0.92     0.60        3.35     0.965
  1.05     0.65        3.51     0.97
  1.20     0.70        3.69     0.975
  1.39     0.75        3.91     0.98
  1.61     0.80        4.20     0.985
  1.90     0.85        4.61     0.99
  2.30     0.90

Drawing grid lines for given $F_S(s)$ at the indicated values of s shown in Table
E6.4a, we obtain the resulting paper, as shown in Fig. E6.4a. A straight line (with
positive slope) on this paper represents a particular exponential distribution, in
which its intercept on the x-axis is the value of a, and its slope is $1/\lambda$.

Sample values from an exponential population should plot, using plotting position
m/(N + 1), approximately on a straight line on this probability paper. To illustrate
this, consider the hypothetical set of data in Table E6.4b for a random variable X.
The mth observed value and corresponding plotting position m/(N + 1) are shown
plotted in Fig. E6.4b. From the straight line drawn (by eye) through the data points
in Fig. E6.4b, we obtain estimates of the parameters $a \approx 150$ and $1/\lambda = 2000/2.69 = 743$.

In the present case, however, the purposes of a probability paper can also be
accomplished with a semilogarithmic paper. Observe that for the exponential
distribution,

$$ 1 - F_X(x) = e^{-\lambda x} $$

thus,

$$ \ln [1 - F_X(x)] = -\lambda x $$

Therefore, by scaling $1 - F_X(x)$ on one axis (in logarithmic scale) and x on the
other axis (in arithmetic scale), the graph between $1 - F_X(x)$ and x is a straight line
on this paper with a slope of $-\lambda$. On this paper, however, sample data should be
plotted at the plotting positions 1 − m/(N + 1).
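
The probability scale of the exponential paper follows directly from inverting $F_S(s) = 1 - e^{-s}$, that is, $s = -\ln(1 - p)$. The short sketch below regenerates a few entries of Table E6.4a and shows how the parameters are read from a straight line $x = a + s/\lambda$. The function names are hypothetical, and the two points passed to `line_parameters` are illustrative values chosen only to reproduce the ratio 2000/2.69 quoted in the example; they are not readings taken from Fig. E6.4b.

```python
import math


def exp_standard_variate(p):
    """Standard variate s of the exponential paper: F_S(s) = 1 - exp(-s) = p."""
    return -math.log(1.0 - p)


def line_parameters(x1, s1, x2, s2):
    """Parameters of the line x = a + s / lam through two points (s, x).

    Returns (a, 1/lam): the intercept at s = 0 and the slope.
    """
    inv_lam = (x2 - x1) / (s2 - s1)
    a = x1 - inv_lam * s1
    return a, inv_lam


# Regenerate part of the probability scale (compare with Table E6.4a)
for p in (0.10, 0.50, 0.85, 0.90, 0.95, 0.99):
    print(f"F_S(s) = {p:5.3f}  ->  s = {exp_standard_variate(p):.2f}")

# Illustrative points (assumed, not read from the figure): x rises by 2000
# while s increases by 2.69, starting from x = 150 at s = 0.
a_hat, inv_lam_hat = line_parameters(150.0, 0.0, 2150.0, 2.69)
print(f"a = {a_hat:.0f}, 1/lambda = {inv_lam_hat:.1f}")
```

On semilogarithmic paper the same information is obtained by plotting $1 - m/(N + 1)$ on the logarithmic axis, since $\ln[1 - F_X(x)]$ is linear in x.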


EXAMPLE 6.5 (The Gumbel probability paper)

One of the extreme-value distributions (see Vol. II) is the Type I asymptotic
distribution of extremes, known also as the Gumbel distribution. Its CDF for the
largest value is given by the double exponential function

$$ F_X(x) = \exp \left[ -e^{-\alpha (x - u)} \right]; \quad -\infty < x < \infty $$

in which u is the characteristic largest value, and $1/\alpha$ is a measure of dispersion.


Table E6.4b. Sample Values of X (N = 40)

    X            m     Plotting position m/(N + 1)
 [illegible]     1       0.024
 [illegible]     2       0.049
   203           3       0.073
   212           5       0.122
   248           7       0.171
   389          16       0.390
  1331          35       0.854
  1031          33       0.805
   208           4       0.098
   226           6       0.146
   289          10       0.244
   543          20       0.488
   360          15       0.366
  1635          37       0.902
   559          21       0.512
   909          28       0.683
   408          17       0.415
  2497          39       0.951
   774          24       0.585
   946          29       0.707
  2781          40       0.976
   308          12       0.293
   274           9       0.220
   531          19       0.463
   460          18       0.439
   791          26       0.634
   952          30       0.732
  1844          38       0.927
   952          31       0.756
  1427          36       0.878
   306          11       0.268
   787          25       0.610
   254           8       0.195
   772          23       0.561
   842          27       0.659
   981          32       0.780
  1122          34       0.829
   611          22       0.537
   332          13       0.317
   343          14       0.341


Figure E6.4a Construction of the exponential probability paper (axes: values of x in arithmetic scale; standard variate s with the corresponding F_S(s))

Figure E6.4b Sample values of X plotted on exponential probability paper (axes: values of x; standard variate s with the corresponding F_S(s))


Table E6.5. Specific Values of s and F_S(s)

    s      F_S(s)        s      F_S(s)        s      F_S(s)
  −1.53     0.01        0.37     0.50        2.48     0.92
  −1.10     0.05        0.51     0.55        2.62     0.93
  −0.83     0.10        0.67     0.60        2.78     0.94
  −0.64     0.15        0.84     0.65        2.97     0.95
  −0.48     0.20        1.03     0.70        3.08     0.955
  −0.33     0.25        1.25     0.75        3.20     0.96
  −0.19     0.30        1.50     0.80        3.33     0.965
  −0.05     0.35        1.82     0.85        3.49     0.97
   0.09     0.40        2.25     0.90        3.68     0.975
   0.23     0.45        2.36     0.91        3.90     0.98


In this case, the standard variate can be defined as

$$ S = \alpha (X - u) $$

Then

$$ F_S(s) = \exp (-e^{-s}) $$

Using the specific values of s and corresponding probabilities $F_S(s)$ calculated as
summarized in Table E6.5, we construct the Gumbel probability paper as follows.
Scale the values of s on one axis and the associated probabilities $F_S(s)$ on the same
(or a parallel) axis as shown in Fig. E6.5. The other axis in Fig. E6.5 represents
values of X, in arithmetic scale. The result is the Type I extremal probability paper,
also known as the Gumbel probability paper.

Again, a straight line on this paper represents a particular Type I extremal
distribution: the value of X on this line at $F_S(s) = e^{-1} = 0.368$ (or s = 0) is the
characteristic largest value u, whereas the slope of the straight line, $x = u + s/\alpha$, is $1/\alpha$, as illustrated
in Fig. E6.5.

Figure E6.5 Construction of Gumbel probability paper (axes: values of x in arithmetic scale; standard variate s with the corresponding F_S(s))
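
The scale of the Gumbel paper can be generated in the same way as for the exponential paper, by inverting $F_S(s) = \exp(-e^{-s})$ to obtain $s = -\ln(-\ln p)$. The following minimal sketch regenerates a few entries of Table E6.5; the function names and the parameter values used in the last lines are hypothetical and serve only to illustrate the relation $x = u + s/\alpha$.

```python
import math


def gumbel_standard_variate(p):
    """Standard variate s of the Gumbel paper: F_S(s) = exp(-exp(-s)) = p."""
    return -math.log(-math.log(p))


def gumbel_quantile(p, u, alpha):
    """Value of X at cumulative probability p on the line x = u + s / alpha."""
    return u + gumbel_standard_variate(p) / alpha


# Regenerate part of the probability scale (compare with Table E6.5)
for p in (0.01, 0.10, 0.50, 0.90, 0.98):
    print(f"F_S(s) = {p:4.2f}  ->  s = {gumbel_standard_variate(p):5.2f}")

# At p = exp(-1) = 0.368 the standard variate is zero, so x equals u.
# The values u = 100 and alpha = 0.05 below are assumed for illustration only.
x_at_u = gumbel_quantile(math.exp(-1.0), u=100.0, alpha=0.05)
print(f"x at F = 0.368: {x_at_u:.1f}")
```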


6.3. TESTING VALIDITY OF ASSUMED DISTRIBUTION 


When a theoretical distribution has been assumed, perhaps determined
on the basis of the general shape of the histogram or on the basis of the
data plotted on a given probability paper, the validity of the assumed
distribution may be verified or disproved statistically by goodness-of-fit
tests. Two such tests for distribution are available, the chi-square and the
Kolmogorov-Smirnov methods; one or the other of these is generally used to
test the validity of an assumed distribution model.


6.3.1. Chi-square test for distribution 


Consider a sample of n observed values of a random variable. The
chi-square goodness-of-fit test compares the observed frequencies $n_1, n_2, \ldots, n_k$
of k values (or in k intervals) of the variate with the corresponding
frequencies $e_1, e_2, \ldots, e_k$ from an assumed theoretical distribution. The basis
for appraising the goodness of this comparison is the distribution of the
quantity

$$ \sum_{i=1}^{k} \frac{(n_i - e_i)^2}{e_i} $$

which approaches the chi-square ($\chi_f^2$) distribution with $f = k - 1$
degrees of freedom as $n \to \infty$ (Hoel, 1962). However, if the parameters of
the theoretical model are unknown and must be estimated from the data,
the above statement remains valid if the number of degrees of freedom is reduced by
one for every unknown parameter that must be estimated.

On this basis, if an assumed distribution yields

$$ \sum_{i=1}^{k} \frac{(n_i - e_i)^2}{e_i} < c_{1-\alpha, f} \tag{6.1} $$

where $c_{1-\alpha, f}$ is the value of the appropriate $\chi_f^2$ distribution at the
cumulative probability $(1 - \alpha)$, the assumed theoretical distribution is an
acceptable model at the significance level $\alpha$. Otherwise, the assumed
distribution is not substantiated by the data at the $\alpha$ significance level.

In applying the $\chi^2$ test for goodness of fit, it is generally necessary (for
satisfactory results) to have $k \ge 5$ and $e_i \ge 5$.


Figure E6.6 Histogram and Poisson model for storm occurrences (ordinate: number of years; abscissa: number of storms in a year; predicted values based on the Poisson distribution with ν = 1.197)


EXAMPLE 6.6 


Suppose that severe rainstorms have been recorded at a given station over a period
of 66 years. During this period, there were 20 years without severe rainstorms; and
23, 15, 6, and 2 years, respectively, with 1, 2, 3, and 4 rainstorms annually. The
histogram for the annual number of rainstorms recorded at the station is shown in Fig.
E6.6. Because the occurrence of severe rainstorms is random, and judging from the
shape of the histogram, a Poisson distribution seems an appropriate model for the
annual number of rainstorms at the given location (station). In particular, on the
basis of the data, we estimate the average occurrence rate of rainstorms annually as

$$ \hat{\nu} = \frac{1}{66} (0 \times 20 + 1 \times 23 + 2 \times 15 + 3 \times 6 + 4 \times 2) = \frac{79}{66} = 1.20 \text{ rainstorms/year} $$

We now apply the chi-square test to determine whether the Poisson distribution is
a suitable model, at the 5% significance level. In this case, since four storms in a year
were observed only twice, these data are combined with the data for three storms a year;
thus, k = 4. Since the parameter $\nu$ is estimated by $\hat{\nu}$, the quantity $\sum_{i=1}^{k} (n_i - e_i)^2/e_i$
has a $\chi^2$ distribution with $f = k - 2 = 2$ degrees of freedom. Based on the
computations summarized in Table E6.6, $\sum (n_i - e_i)^2/e_i = 0.068$, which is less than
$c_{0.95, 2} = 5.99$. Hence the Poisson distribution is a valid model for the annual number
of rainstorms, at the 5% significance level.


Table E6.6. χ² Test for Storm Occurrence

 No. of storms       Observed        Theoretical
 at station          frequency       frequency
 per year            n_i             e_i            (n_i − e_i)²     (n_i − e_i)²/e_i
     0                 20             19.94           0.0036            0.0002
     1                 23             23.87           0.7569            0.0317
     2                 15             14.29           0.5041            0.0353
    ≥3                  8              7.90           0.0100            0.0013
     Σ                 66             66.00                             0.0685
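
The computations of Table E6.6 can be reproduced with a few lines of code. The sketch below is an illustration under assumptions: the variable names are hypothetical, the theoretical frequency of the last class (three or more storms) is taken as the remainder so that the frequencies sum to 66, as in the table, and the critical value 5.99 is simply the figure quoted in the example rather than one computed here. Small differences from the tabulated entries arise because the table uses rounded frequencies.

```python
import math

# Observed number of years with 0, 1, 2, 3, 4 severe rainstorms (Example 6.6)
years_with_k_storms = {0: 20, 1: 23, 2: 15, 3: 6, 4: 2}

n_years = sum(years_with_k_storms.values())
nu_hat = sum(k * n for k, n in years_with_k_storms.items()) / n_years


def poisson_pmf(k, nu):
    """Poisson probability of exactly k occurrences with mean rate nu."""
    return math.exp(-nu) * nu ** k / math.factorial(k)


# Classes 0, 1, 2 and ">= 3" (3 and 4 storms combined), so k = 4 classes
observed = [years_with_k_storms[0], years_with_k_storms[1],
            years_with_k_storms[2],
            years_with_k_storms[3] + years_with_k_storms[4]]
expected = [n_years * poisson_pmf(k, nu_hat) for k in range(3)]
expected.append(n_years - sum(expected))      # remainder goes to ">= 3"

chi_square = sum((n - e) ** 2 / e for n, e in zip(observed, expected))
critical_value = 5.99   # c_{0.95, 2} as quoted in the text (f = 4 - 2 = 2)

print(f"nu_hat = {nu_hat:.3f} storms/year")
print("expected frequencies:", [round(e, 2) for e in expected])
print(f"chi-square = {chi_square:.4f}, acceptable: {chi_square < critical_value}")
```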
EXAMPLE 6.7 


Consider the frequency diagram for the crushing strength of concrete cubes shown
in Fig. E6.7. Visually, on the basis of the frequency diagram and the theoretical
distributions shown in Fig. E6.7, both the normal and the log-normal density
functions appear to be suitable models for the concrete strength.

In this case, the $\chi^2$ test will be used to determine the relative goodness of fit
between the two candidate distributions.
For the purpose of this example, 8 intervals of the strength are considered as
shown in Table E6.7.


Figure E6.7 Frequency diagram of crushing strengths of concrete cubes (data
from Cusens and Wettern, 1959) (abscissa: crushing strength, ksi)


Table E6.7. Chi-Square Test for Relative Goodness-of-Fit

                 Observed       Theoretical frequency e_i       (n_i − e_i)²/e_i
 Interval        frequency
 (ksi)           n_i            Normal      Log-normal          Normal      Log-normal
 <6.75             9             11.1          9.9               0.40         0.09
 6.75-7.00        17             13.2         14.0               1.09         0.92
 7.00-7.25        22             21.1         22.1               0.04         0.00
 7.25-7.50        31             26.1         26.9               0.92         0.62
 7.50-7.75        28             26.1         25.6               0.14         0.23
 7.75-8.00        20             21.0         19.8               0.05         0.00
 8.00-8.50         9             20.2         19.4               6.22         5.57
 >8.50             7              4.2          5.3               1.87         0.54
 Σ               143            143.0        143.0              10.73         7.97


The observed and theoretical frequencies within the indicated intervals are
summarized in Table E6.7.

In both cases (that is, the normal and the log-normal distributions), the respective
parameters were estimated by the sample mean and sample variance; hence the net
number of degrees of freedom for either distribution is $f = 8 - 3 = 5$. At the
significance level $\alpha = 5\%$, we obtain from Table A.3, $c_{0.95, 5} = 11.07$. Comparing
this with the values of $\sum (n_i - e_i)^2/e_i$ calculated in Table E6.7, we observe that
although both distributions appear to be valid models for the concrete strength (on
the basis of the frequency diagram of Fig. E6.7), the log-normal model is superior to
the normal model according to this test, because 7.97 < 10.73.


It should be emphasized that because there is arbitrariness in the choice
of the significance level $\alpha$, the $\chi^2$ goodness-of-fit test (as well as the
Kolmogorov-Smirnov method described subsequently) may not provide
absolute information on the validity of a specific distribution. For example,
it is conceivable that a distribution that is acceptable at one significance
level may be unacceptable at another significance level. In spite of this,
however, such tests remain useful, especially for determining the relative
goodness of fit of two or more theoretical distributions, as illustrated in
Example 6.7.


6.3.2. Kolmogorov-Smirnov test for distribution 


Another widely used goodness-of-fit test is the Kolmogorov-Smirnov
(K-S) test. The basic procedure involves the comparison between the
experimental cumulative frequency and an assumed theoretical distribution
function. If the discrepancy is large with respect to what is normally
expected from a given sample size, the theoretical model is rejected.

For a sample of size n, rearrange the set of observed data in increasing
order. From these ordered sample data we develop a stepwise cumulative
frequency function as follows:

$$ S_n(x) = \begin{cases} 0; & x < x_1 \\ \dfrac{k}{n}; & x_k \le x < x_{k+1} \\ 1; & x \ge x_n \end{cases} \tag{6.2} $$

Figure 6.2 Empirical cumulative frequency vs. theoretical distribution function


where $x_1, x_2, \ldots, x_n$ are the values of the ordered sample data, and n is
the sample size. Figure 6.2 shows a plot of $S_n(x)$ and also the proposed
theoretical distribution function F(x). In the Kolmogorov-Smirnov test,
the maximum difference between $S_n(x)$ and F(x) over the entire range of
X is the measure of discrepancy between the theoretical model and the
observed data. Let this maximum difference be denoted by

$$ D_n = \max_x | F(x) - S_n(x) | \tag{6.3} $$

Theoretically, $D_n$ is a random variable whose distribution depends on n.
For a specified significance level $\alpha$, the K-S test compares the observed
maximum difference of Eq. 6.3 with the critical value $D_n^{\alpha}$, which is
defined by

$$ P(D_n \le D_n^{\alpha}) = 1 - \alpha $$

Critical values $D_n^{\alpha}$ at various significance levels $\alpha$ are tabulated in Table
A.4 for various values of n. If the observed $D_n$ is less than the critical
value $D_n^{\alpha}$, the proposed distribution is acceptable at the specified
significance level $\alpha$; otherwise, the assumed distribution would be rejected.

The advantage of the Kolmogorov-Smirnov (K-S) test over the
chi-square ($\chi^2$) test is that it is not necessary to divide the data into intervals;
hence the problems associated with the chi-square approximation for
small $e_i$ and/or small number of intervals k would not appear with the
K-S test.


EXAMPLE 6.8 


The data for fracture toughness of steel plate in Example 6.1 have been plotted on
normal probability paper as shown in Fig. E6.1. The data appear to fall
approximately on a straight line corresponding to a normal model N(77, 4.6). Perform a
Kolmogorov-Smirnov test to evaluate the appropriateness of this model relative to
the given data, at the 5% significance level.

The sample cumulative frequencies are plotted according to Eq. 6.2 in Fig. E6.8.
The theoretical distribution function for the proposed normal model N(77, 4.6) is
also shown in the same figure. The maximum discrepancy between the two functions
is $D_n = 0.16$ and occurs at $K_{Ic}$ = 77 ksi √in. In this case, there are 26 data
points; hence the critical value of $D_n^{\alpha}$ at the 5% significance level is $D_{26}^{0.05} = 0.265$
(obtained from Table A.4). Since the maximum discrepancy of 0.16 is less than
$D_{26}^{0.05} = 0.265$, the normal model N(77, 4.6) is verified at the 5% significance level.

Figure E6.8 Cumulative distribution of fracture toughness data
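
The maximum discrepancy of Eq. 6.3 need not be read from a graph. Because $S_n(x)$ is a step function, the largest difference must occur at one of the data points, either just before or just after a step, so it suffices to check both values there. The following is a minimal sketch for Example 6.8, assuming the data of Table E6.1 as they read in this copy; the function name `ks_statistic` is hypothetical, and the critical value 0.265 is the figure quoted from Table A.4 in the text.

```python
from statistics import NormalDist

# Fracture toughness data of Table E6.1 (ksi sqrt(in.)), as read from this copy
k_ic = [69.5, 71.9, 72.6, 73.1, 73.3, 73.5, 74.4, 74.2, 75.3, 75.5,
        75.7, 75.8, 76.1, 76.2, 76.2, 76.9, 77.0, 77.9, 78.1, 79.6,
        79.7, 79.9, 80.1, 82.2, 83.7, 93.7]


def ks_statistic(data, cdf):
    """Maximum difference between the stepwise S_n(x) of Eq. 6.2 and cdf(x)."""
    xs = sorted(data)
    n = len(xs)
    d_max = 0.0
    for k, x in enumerate(xs, start=1):
        f = cdf(x)
        # S_n jumps from (k - 1)/n to k/n at x_k; check both sides of the step
        d_max = max(d_max, abs(f - (k - 1) / n), abs(f - k / n))
    return d_max


model = NormalDist(mu=77.0, sigma=4.6)     # proposed model N(77, 4.6)
d_n = ks_statistic(k_ic, model.cdf)
critical_value = 0.265                      # D_26 at the 5% level, from the text

print(f"D_n = {d_n:.3f}")
print("model acceptable at 5% level:", d_n < critical_value)
```

The computed value is close to the 0.16 read from Fig. E6.8 and occurs at the same point, $K_{Ic}$ = 77.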


EXAMPLE 6.9 


Data for stream temperature at mile 41.83 of the Little Deschutes River in Oregon,
measured at 3-hr intervals over a 3-day period (August 1-3, 1969), are shown plotted
in Fig. E6.9 in accordance with Eq. 6.2. The distribution function of the proposed
theoretical model is also shown in the same figure.

The maximum difference between $S_n(x)$ and F(x) is observed to be
$D_n = 0.174$ at the temperature of 70.9°F.

With n = 23, the critical value $D_n^{\alpha}$ at the 5% significance level, obtained from
Table A.4, is $D_{23}^{0.05} = 0.273$. Since $D_n < D_{23}^{0.05}$, the proposed theoretical distribution
is suitable for modeling the stream temperature of this river at the significance level
of $\alpha = 5\%$.

Figure E6.9 Kolmogorov-Smirnov test for proposed stream temperature prediction model (after Morse, 1972). Little Deschutes R., Oregon, River Mile 41.83, 1 Aug.-3 Aug. 1969; K-S statistic 0.174; critical value 0.273 (ordinate: cumulative distribution, predicted and observed; abscissa: ordered random samples, °F)


6.4. CONCLUDING REMARKS

Whereas Chapter 5 was concerned with the statistical estimation of
parameters of a distribution, this chapter is concerned with the
determination of the probability distribution for a random variable, and with
questions related to the validity of an assumed distribution, based on finite
samples of the population. Unless developed theoretically from physical
considerations, the required distribution model may be determined
empirically. One way of doing this is through the use of probability papers
constructed for specific distributions. The linearity, or lack of linearity, of
sample data plotted on such papers would suggest the appropriateness of a
given distribution for modeling the population.

The validity of an assumed distribution may also be appraised by
goodness-of-fit tests, including specifically the chi-square ($\chi^2$) and the
Kolmogorov-Smirnov (K-S) tests. Such tests, however, depend on the
prescribed level of significance, the choice of which is largely a subjective
matter. Nevertheless, these tests are useful for determining (in the light of
sample data) the relative appropriateness of several potentially possible
distribution models.


PROBLEMS 
6.1 Plot the data in Example 6.1 on log-normal probability paper. Estimate the
median and COV from the straight line drawn through the data points.


6.2 The ultimate strains ($\epsilon_u$, in %) of 15 No. 5 steel reinforcing bars were
measured. The results are as follows (data from Allen, 1972):

        19.4     17.9     16.1
        16.0     12.8     16.8
        16.6     18.8     17.0
        17.3     20.1     18.1
        18.4     19.4     18.6

Plot these data on both the normal and the log-normal probability papers,
and discuss the results.


6.3 The shear strengths (in kips per square foot, ksf) of 13 undisturbed samples of
clay from the Chicago subway project are tabulated as follows (data from
Peck, 1940):

        0.35     0.42     0.49     0.70     0.96
        0.40     0.43     0.58     0.75
        0.41     0.48     0.68     0.87

Plot the data on log-normal probability paper. Estimate the parameters of the
log-normal distribution to describe the shear strength of Chicago clay.

6.4 For the wind velocity data in Problem 3.25 (of Chapter 3), plot the data on
normal probability paper. Determine the normal distribution for describing
the wind velocity.


6.5 A random variable with a triangular distribution between a and a + r, as
shown in Fig. P6.5, is given by the density function

        f_X(x) = [expression illegible in this copy];   a ≤ x ≤ a + r
               = 0;   elsewhere.

(a) Determine the appropriate standard variate S for this distribution.
(b) Construct the corresponding probability paper. What do the values of X
at $F_S(0)$ and $F_S(1.0)$ on this paper mean?
(c) Suppose the following sample values were observed for X.

        36             32             34             71
        18             69             45             66
        56             [illegible]    53             58
        64             50             55             53
        72             28             62             48
        [illegible]    [illegible]    [illegible]    75

Plot the above set of data on the triangular probability paper. From this plot
estimate the minimum and maximum values of X.


6.6 The density function of the Rayleigh distribution is given by

$$ f_X(x) = \begin{cases} \dfrac{x}{\alpha^2} e^{-\frac{1}{2}(x/\alpha)^2}; & x \ge 0 \\ 0; & x < 0 \end{cases} $$

in which the parameter $\alpha$ is the modal (or most probable) value of X.
(a) Construct the probability paper for this distribution. What does the
slope of a straight line on this paper represent?
(b) The following is a set of data for stress range induced by vehicle loads on
highway bridge members.

Strain Range, in micro in./in. (Data courtesy of W. H. Walker)

         48.4          52.7           42.4
         47.1          44.5          146.2
         49.5          84.8          115.2
        116.0          52.6           43.0
         84.1          53.6          103.6
         99.3          [illegible]    64.7
        108.1          43.8           69.8
         41.3          56.3           44.0
         93.7         134.5           36.2
         36.3          62.8           50.6
        122.5         180.5          167.0

Plot this set of data on the Rayleigh probability paper constructed in
part (a).

(c) What inference can you draw regarding the Rayleigh distribution as a
model for live-load stress range in highway bridges, in light of the data
plotted above? Determine the most probable stress range (if possible)
from the results of part (b).


Figure P6.5 (ordinate: f_X(x))

6.7 Time-to-failure (or malfunction) of a certain type of diesel engine has been
recorded as follows (in hours).

           0.13      121.58     2959.47      102.34
         709.78      672.87      124.09      393.37
           3.55       62.09       85.28      184.09
          14.29      656.04      380.00     1646.01
          54.85      735.89      298.58      412.03
         216.40      895.80      678.13      813.00
        1296.93     1057.57      861.93      239.10
         952.65      470.97     1885.22     2633.98
           8.82      151.44      862.93      658.38
          29.75      163.95     1407.52      855.95

(a) Construct the exponential probability paper (see Example 6.4) and plot
on it the data given above.
(b) On the basis of the results of part (a), estimate the minimum and mean
time-to-failure of such engines.
(c) Perform a chi-square test to determine the validity of the exponential
distribution at the 1% significance level.


6.8 The following are observations of the number of vehicles per minute arriving
at an intersection from a one-way street:

        0, 3, 1, 2, 0, 1, 1, 1, 2, 0, 1, 4, 3, 1, 1, 0, 0, 1, 0, 2

Perform a chi-square test to determine if the arrivals can be modeled by a
Poisson process, at the 1% significance level.


6.9 Cars coming toward an intersection are required to stop at the stop sign
before they find a gap large enough to cross or to make a turn. This acceptance
gap G, measured in seconds, varies from driver to driver, since some drivers
may be more alert or more risk-taking than others. The following are
measurements taken for several similar intersections.


7. Regression and
Correlation Analyses


When dealing with two or more variables, the functional relation between
the variables is often of interest. However, if one or both variables (in a
two-variable case) are random, there will be no unique relationship
between the values of the two variables. Given a value of one variable (the
controlled variable), there is a range of possible values of the other, and
thus a probabilistic description is required. If the probabilistic relationship
between the variables is described in terms of the mean and variance of
one random variable as a function of the value of the other variable, we
have what is known as regression analysis. When the analysis is limited to
linear mean-value functions, it is called linear regression. In general,
however, regression may be nonlinear. In some cases, nonlinear regression
problems may be converted to linear ones by appropriate transformation
of the original variables.

In the following, we present the concepts of regression analysis
(including nonlinear regression and multiple linear regression), and their
applications to engineering problems.


7.1. BASIC FORMULATION OF LINEAR REGRESSION


7.1.1. Regression with constant variance


When pairwise data for two variables, say X and Y, are plotted on a
two-dimensional graph, such as shown in Fig. 7.1, the possible values of one
variable, for example, Y, may depend on the value of the other variable
X. For this reason, it would be inappropriate to analyze the data, say
for Y (for example, in determining the mean and variance of Y), without
due consideration of X. In the case of Fig. 7.1, we observe that there
is a general tendency for the values of Y to increase with increasing values
of X (X may be deterministic or random). Hence the mean value of Y
will also increase with increasing values of X; the actual values of Y, of
course, may not always increase with increasing values of X. In general,
the mean value of Y will depend on the value of X. Suppose that this
relationship is linear; that is,

$$ E(Y \mid X = x) = \alpha + \beta x \tag{7.1} $$

where $\alpha$ and $\beta$ are constants, and the variance of Y may be independent
of x or a function of x. This is known as the linear regression of Y on X.
Consider first the case with Var(Y | x) = constant.

Figure 7.1 Linear regression analysis of data for two variables

Conceivably, there could be many straight lines, depending on the values
of $\alpha$ and $\beta$, that might qualify as the mean-value function of Y in the light
of the data. The "best" line may be the one that passes through the data
points with the least error. To obtain this, we see from Fig. 7.1 that the
difference between each observed value $y_i$ and the straight line
$y_i' = \alpha + \beta x_i$ is $| y_i - y_i' |$. Therefore the line with the least total error can be
obtained by minimizing the sum of the squared errors, that is, by
minimizing

$$ \Delta^2 = \sum_{i=1}^{n} (y_i - y_i')^2 = \sum_{i=1}^{n} (y_i - \alpha - \beta x_i)^2 $$

to obtain $\hat{\alpha}$ and $\hat{\beta}$, where n is the number of data points. This is known as
the method of least squares, which leads to the following:

$$ \frac{\partial \Delta^2}{\partial \alpha} = \sum_{i=1}^{n} 2 (y_i - \alpha - \beta x_i)(-1) = 0 $$

$$ \frac{\partial \Delta^2}{\partial \beta} = \sum_{i=1}^{n} 2 (y_i - \alpha - \beta x_i)(-x_i) = 0 $$


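from which the least-squares estimates of $\alpha$ and $\beta$ are as follows:

$$ \hat{\alpha} = \frac{1}{n} \sum y_i - \frac{\hat{\beta}}{n} \sum x_i = \bar{y} - \hat{\beta} \bar{x} \tag{7.2} $$

and

$$ \hat{\beta} = \frac{\sum x_i y_i - n \bar{x} \bar{y}}{\sum x_i^2 - n \bar{x}^2} = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2} \tag{7.3} $$

where $\bar{x} = \frac{1}{n} \sum x_i$ and $\bar{y} = \frac{1}{n} \sum y_i$.
Therefore the least-squares regression line is

$$ E(Y \mid x) = \hat{\alpha} + \hat{\beta} x \tag{7.4} $$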

It should be emphasized that, strictly speaking, this regression line is valid
only over the range of values of x for which data had been observed.
Equations 7.1 and 7.4 are referred to as the regression of Y on X. If X
and Y are both random variables, we may also obtain the least-squares
regression of X on Y using the same procedure; in this latter case, we would
obtain the regression equation for $E(X \mid y)$. In general, this is a different
linear equation from that of $E(Y \mid x)$; the two regression lines, however,
always intersect at $(\bar{x}, \bar{y})$. For example, Meadows et al. (1972) discussed
the per capita energy consumption Y versus the per capita GNP output
X of different countries. If we are interested in predicting the energy
consumption for a given GNP output of a country, a regression analysis of Y
on X would be appropriate. Alternatively, the GNP output of a country
may be estimated on the basis of the energy consumption; in this case, the
regression of X on Y is required (see Problem 7.5).
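The statement that the two regression lines differ yet share the point of sample means can be checked numerically. The sketch below assumes a small made-up data set; the helper `fit_line` is an illustrative name, not something defined in the text.

```python
def fit_line(us, vs):
    """Least-squares line of v on u; returns (intercept, slope)."""
    n = len(us)
    u_bar = sum(us) / n
    v_bar = sum(vs) / n
    s_uv = sum((u - u_bar) * (v - v_bar) for u, v in zip(us, vs))
    s_uu = sum((u - u_bar) ** 2 for u in us)
    slope = s_uv / s_uu
    return v_bar - slope * u_bar, slope


# Hypothetical paired observations.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 1.0, 4.0, 3.0, 5.0]
x_bar = sum(xs) / len(xs)
y_bar = sum(ys) / len(ys)

a_yx, b_yx = fit_line(xs, ys)  # regression of Y on X: y = a_yx + b_yx * x
a_xy, b_xy = fit_line(ys, xs)  # regression of X on Y: x = a_xy + b_xy * y

# Both lines pass through the point of means ...
on_first = a_yx + b_yx * x_bar
on_second = a_xy + b_xy * y_bar
# ... but, drawn in the same x-y plane, their slopes differ.
slope_second_in_xy_plane = 1.0 / b_xy
print(on_first, y_bar, on_second, x_bar, b_yx, slope_second_in_xy_plane)
```

Only when the data lie exactly on a straight line do the two slopes coincide.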
Since the general trend is accounted for through the regression line of
Eq. 7.4, the variance about this line is the measure of dispersion of interest,
which is the conditional variance $\mathrm{Var}(Y \mid x)$. For the case where the
conditional variance $\mathrm{Var}(Y \mid x)$ is assumed to be constant within the
range of x of interest, an unbiased estimate of this variance is

$$s_{Y|x}^2 = \frac{1}{n-2} \sum_{i=1}^{n} (y_i - y_i')^2 \tag{7.5}$$

Observe that this is

$$s_{Y|x}^2 = \frac{\Delta^2}{n-2} \tag{7.5a}$$

Thus the corresponding conditional standard deviation is $s_{Y|x}$.


The coefficients $\hat{\alpha}$ and $\hat{\beta}$, and $s_{Y|x}^2$, are estimates of the respective true
values of $\alpha$, $\beta$, and $\mathrm{Var}(Y \mid x)$. Confidence intervals may also be
established on the basis of available data. For this purpose, if we assume that
Y has a normal distribution about the regression line $E(Y \mid x)$ for all values
of x, then $\hat{\alpha}$ and $\hat{\beta}$ individually follow the t-distribution (Hald, 1952).
In such a case, the regression values

$$E(Y \mid x) = \hat{\alpha} + \hat{\beta} x$$

will also be t-distributed. On these bases, the required confidence intervals
can be determined. It is worth noting here that these intervals for $\alpha$, $\beta$,
$E(Y \mid x)$, and $\mathrm{Var}(Y \mid x)$ will decrease with increasing n.

The physical effect of the linear regression of Y on X can be measured
by the reduction of the original variance of Y, $s_Y^2$, resulting from taking
into account the general trend with X. This reduction is represented by

$$r^2 = 1 - \frac{s_{Y|x}^2}{s_Y^2} \tag{7.6}$$

where

$$s_Y^2 = \frac{1}{n-1} \sum_{i=1}^{n} (y_i - \bar{y})^2$$

is the sample variance of Y. It will be shown later (in Section 7.5) that
(for large n) r is approximately equal to a point estimate of the correlation
coefficient.
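The conditional variance estimate and the variance reduction can be computed side by side. This sketch assumes that the reduction is expressed as r² = 1 − s²(Y|x)/s²(Y), which is the reading of Eq. 7.6 adopted here; the data and all function names are hypothetical.

```python
def regression_summary(xs, ys):
    """Return (s2_cond, s2_y, r2) for the least-squares line of y on x.

    s2_cond : residual sum of squares divided by (n - 2), as in Eq. 7.5
    s2_y    : ordinary sample variance of y, divisor (n - 1)
    r2      : 1 - s2_cond / s2_y (assumed form of the variance reduction)
    """
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    beta_hat = s_xy / s_xx
    alpha_hat = y_bar - beta_hat * x_bar
    sse = sum((y - alpha_hat - beta_hat * x) ** 2 for x, y in zip(xs, ys))
    s2_cond = sse / (n - 2)
    s2_y = sum((y - y_bar) ** 2 for y in ys) / (n - 1)
    return s2_cond, s2_y, 1.0 - s2_cond / s2_y


# Hypothetical data with a strong linear trend.
s2_cond, s2_y, r2 = regression_summary([1.0, 2.0, 3.0, 4.0], [3.1, 4.9, 7.2, 8.8])
print(s2_cond, s2_y, r2)
```

A value of r² close to 1 indicates that most of the original scatter in Y is explained by the trend with X.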


Regression of normal variates. The assumptions of linear model and
constancy of variance underlying linear regression are, in fact, inherent
properties of populations that are jointly normal. We recall from Example
3.25 that if X and Y are jointly normally distributed, the conditional
mean and variance of Y given X = x are as follows:

$$E(Y \mid x) = \mu_Y + \rho \frac{\sigma_Y}{\sigma_X} (x - \mu_X)$$

and

$$\mathrm{Var}(Y \mid x) = \sigma_Y^2 (1 - \rho^2)$$

where $\rho$ is the correlation coefficient. These results mean that if two
variates are jointly normal the regression of Y on X is linear with constant
conditional variance (that is, independent of x); specifically, in this case,
the linear equation is of the form of Eq. 7.1 with

$$\beta = \rho \frac{\sigma_Y}{\sigma_X}$$

and

$$\alpha = \mu_Y - \rho \frac{\sigma_Y}{\sigma_X} \mu_X$$

Therefore, if the underlying populations are jointly normal, linear
regression should, properly, be used.
The expressions for $\alpha$ and $\beta$ given above may be compared with the
least-squares estimates $\hat{\alpha}$ and $\hat{\beta}$ of Eqs. 7.2 and 7.3 (and its subsequent
extension in Eq. 7.23). Also, the above conditional variance may be
compared with the corresponding estimate $s_{Y|x}^2$ given subsequently in Eq. 7.24.
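For jointly normal variates the regression constants follow directly from the population parameters. The sketch below simply evaluates the standard bivariate-normal results (slope ρσ_Y/σ_X, intercept μ_Y minus slope times μ_X, conditional variance σ_Y²(1 − ρ²)); the function name and the numerical inputs are assumptions made for illustration.

```python
def normal_regression(mu_x, mu_y, sigma_x, sigma_y, rho):
    """Return (alpha, beta, cond_var) for the regression of Y on X
    when X and Y are jointly normal."""
    if sigma_x <= 0 or sigma_y <= 0:
        raise ValueError("standard deviations must be positive")
    if not -1.0 <= rho <= 1.0:
        raise ValueError("correlation coefficient must lie in [-1, 1]")
    beta = rho * sigma_y / sigma_x            # slope of E(Y|x)
    alpha = mu_y - beta * mu_x                # intercept of E(Y|x)
    cond_var = sigma_y ** 2 * (1.0 - rho ** 2)  # Var(Y|x), independent of x
    return alpha, beta, cond_var


# Hypothetical population parameters.
alpha, beta, cond_var = normal_regression(10.0, 20.0, 2.0, 4.0, 0.5)
print(alpha, beta, cond_var)
```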
EXAMPLE 7.1

Tabulated in the first three columns of Table E7.1 are shear strengths, in kips per
square foot (ksf), obtained from 10 specimens taken at various depths of a clay
stratum. Determine the mean and variance of the shear strength as a linear function
of depth. Assume that the variance is constant with depth.
Table E7.1 summarizes the computations in the regression analysis.
On the basis of the calculations in Table E7.1, the least-squares mean shear
strength (in ksf) as a function of depth x is given by

$$E(Y \mid x) = 0.015 + 0.0517x$$

whereas the variance of the shear strength at a given depth is estimated to be 0.0368
(ksf)$^2$, giving $s_{Y|x}$ = 0.192 ksf. If the linear trend with depth is not taken into
account, the unconditional variance of the shear strength would be 0.197 (ksf)$^2$, and
$s_Y$ = 0.44 ksf. Hence the conditional standard deviation $s_{Y|x}$ is considerably smaller
than $s_Y$.
The regression equation obtained above may be used to predict the shear strength
from 6 ft to 30 ft deep. It may not apply to depths beyond 30 ft, unless the linear
trend can be justified beyond this depth on physical grounds (for example, the same
soil type).
Graphically, the regression line obtained above is shown in Fig. E7.1; also shown
is the envelope with $\pm s_{Y|x}$ from the regression line. This represents a band width of
one (conditional) standard deviation on either side of the regression line.

Table E7.1 Computational tableau for Example 7.1 (entries not recoverable)

Figure E7.1 Regression line for shear strength with depth (axes: depth x; shear strength y, ksf; regression line 0.015 + 0.0517x)


Table E7.2. Computational Tableau for Example 7.2

      Precipitation  Runoff
  i     $x_i$ (in.)   $y_i$ (in.)   $x_i y_i$    $x_i^2$    $y_i' = \hat{\alpha} + \hat{\beta} x_i$    $y_i - y_i'$    $(y_i - y_i')^2$
  1       1.11        0.52       0.58      1.23       0.343        0.177       0.0313
  2       1.17        0.40       0.47      1.37       0.369        0.031       0.0009
  3       1.79        0.97       1.74      3.20       0.637        0.333       0.1110
  4       5.62        2.92      16.40     31.60       2.280        0.640       0.4000
  5       1.13        0.17       0.19      1.28       0.351       -0.181       0.0328
  6       1.54        0.19       0.29      2.37       0.530       -0.340       0.1158
  7       3.19        0.76       2.43     10.15       1.245       -0.485       0.2360
  8       1.73        0.66       1.14      2.99       0.612        0.048       0.0023
  9       2.09        0.78       1.63      4.37       0.770        0.010       0.0001
 10       2.75        1.24       3.41      7.55       1.059        0.181       0.0328
 11       1.20        0.39       0.47      1.44       0.381        0.009       0.0001
 12       1.01        0.30       0.30      1.02       0.299        0.001       0.0000
 13       1.64        0.70       1.15      2.69       0.574        0.126       0.0158
 14       1.57        0.77       1.21      2.46       0.544        0.226       0.0511
 15       1.54        0.59       0.91      2.37       0.530        0.060       0.0036
 16       2.09        0.95       1.99      4.36       0.770        0.180       0.0326
 17       3.54        1.02       3.62     12.55       1.400       -0.380       0.1442
 18       1.17        0.39       0.46      1.37       0.368        0.022       0.0004
 19       1.15        0.23       0.26      1.32       0.360       -0.130       0.0169
 20       2.57        0.45       1.16      6.60       0.980       -0.530       0.2810
 21       3.57        1.59       5.66     12.74       1.415        0.175       0.0306
 22       5.11        1.74       8.90     26.18       2.084       -0.344       0.1185
 23       1.52        0.56       0.85      2.31       0.521        0.039       0.0015
 24       2.93        1.12       3.28      8.58       1.135       -0.015       0.0002
 25       1.16        0.64       0.74      1.34       0.365        0.275       0.0755
 $\sum$      53.89       20.05      59.24    153.44                                1.7350

$$\bar{x} = \frac{53.89}{25} = 2.16, \qquad \bar{y} = \frac{20.05}{25} = 0.80$$

$$\hat{\beta} = \frac{59.24 - 25(2.16)(0.80)}{153.44 - 25(2.16)^2} = 0.435$$

$$\hat{\alpha} = 0.80 - (0.435)(2.16) = -0.14$$


Figure E7.2 Runoff vs. precipitation for Monocacy River basin (axes: precipitation x, in.; runoff y, in.; regression line -0.14 + 0.435x)


EXAMPLE 7.2 


The precipitation and runoff data for the 25 storms on the Monocacy River were
given earlier in Example 5.6.

(a) Plot the observed data for runoff vs. precipitation.

(b) Determine and draw the regression line of runoff on precipitation (that is, the
mean runoff for given value of precipitation).

(c) Estimate the variance of runoff for a given precipitation. Assume that the
variance of the runoff is constant with precipitation.

(d) Assume that the runoff corresponding to a given precipitation is a normal
variate; what is the probability that the runoff will exceed 2 in. during a storm with
4-in. precipitation?


Solution
(a) The plot of data is shown in Fig. E7.2.
(b) For the regression analysis, see Table E7.2. From the table, we obtain

$$E(Y \mid x) = -0.14 + 0.435x$$

It should be emphasized that this regression equation is applicable only within the
range of the data; in particular, it should not be used for precipitation less than 1 in.
(c) For given precipitation, the variance of runoff is

$$s_{Y|x}^2 = \frac{1.7350}{25 - 2} = 0.075 \text{ in.}^2$$

so that

$$s_{Y|x} = \sqrt{0.075} = 0.274 \text{ in.}$$

(d) When the precipitation is 4 in., the mean runoff is

$$E(Y \mid X = 4) = -0.14 + 0.435(4) = 1.6 \text{ in.}$$

Therefore the normal distribution for the runoff Y in this storm is N(1.6, 0.274) in.


Hence

$$P(Y > 2 \mid X = 4) = 1 - \Phi\left(\frac{2 - 1.6}{0.274}\right) = 1 - \Phi(1.46) = 1 - 0.9279 = 0.0721$$
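The steps of Example 7.2 can be retraced in code. The sketch below is only illustrative: the 25 data pairs are read from Table E7.2, with a few digits reconstructed so that the column totals (53.89 and 20.05) are matched, and the variable names are assumptions. Carried at full precision, the estimates come out near 0.43 and -0.12 rather than the hand values 0.435 and -0.14, which use means rounded to two decimals; the exceedance probability is correspondingly about 0.07.

```python
from math import sqrt
from statistics import NormalDist

# Precipitation x and runoff y (in.) for the 25 storms, as read from Table E7.2.
x = [1.11, 1.17, 1.79, 5.62, 1.13, 1.54, 3.19, 1.73, 2.09, 2.75, 1.20, 1.01, 1.64,
     1.57, 1.54, 2.09, 3.54, 1.17, 1.15, 2.57, 3.57, 5.11, 1.52, 2.93, 1.16]
y = [0.52, 0.40, 0.97, 2.92, 0.17, 0.19, 0.76, 0.66, 0.78, 1.24, 0.39, 0.30, 0.70,
     0.77, 0.59, 0.95, 1.02, 0.39, 0.23, 0.45, 1.59, 1.74, 0.56, 1.12, 0.64]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# (b) Least-squares estimates, Eqs. 7.2 and 7.3.
beta_hat = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
            / sum((xi - x_bar) ** 2 for xi in x))
alpha_hat = y_bar - beta_hat * x_bar

# (c) Conditional standard deviation, Eq. 7.5.
sse = sum((yi - alpha_hat - beta_hat * xi) ** 2 for xi, yi in zip(x, y))
s_cond = sqrt(sse / (n - 2))

# (d) P(Y > 2 | X = 4), assuming Y is normal about the regression line.
mean_at_4 = alpha_hat + beta_hat * 4.0
p_exceed = 1.0 - NormalDist(mean_at_4, s_cond).cdf(2.0)

print(f"alpha = {alpha_hat:.3f}, beta = {beta_hat:.3f}, s = {s_cond:.3f}, P = {p_exceed:.3f}")
```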


7.1.2. Regression with nonconstant variance


Conceivably, the conditional variance about the regression line may be a
function of the independent (controlled) variable; this would be the case
when the scattergram of the data shows a significant variation in the degree
of scatter with values of the controlled variable. In such cases, the
regression analysis presented in Section 7.1.1 can be modified to take account
of the variation in the conditional variance. This variation may be
expressed as

$$\mathrm{Var}(Y \mid x) = \sigma^2 g^2(x)$$

where g(x) is a predetermined function and $\sigma$ is an unknown constant.
Again, for linear regression,

$$E(Y \mid x) = \alpha + \beta x$$

In determining the regression coefficients, it would seem reasonable to assume
that data points in regions of small variance should have more "weight"
than those in regions of large variance. On this premise, we therefore
assign weights inversely proportional to the variance; or

$$w_i' = \frac{1}{\mathrm{Var}(Y \mid x_i)} = \frac{1}{\sigma^2 g^2(x_i)}$$

Then the squared error is

$$\Delta^2 = \sum_{i=1}^{n} w_i' (y_i - \alpha - \beta x_i)^2$$

from which the least-squares estimates of $\alpha$ and $\beta$ become

$$\hat{\alpha} = \frac{\sum w_i y_i - \hat{\beta} \sum w_i x_i}{\sum w_i} \tag{7.7}$$

and

$$\hat{\beta} = \frac{\sum w_i \left( \sum w_i x_i y_i \right) - \left( \sum w_i y_i \right) \left( \sum w_i x_i \right)}{\sum w_i \left( \sum w_i x_i^2 \right) - \left( \sum w_i x_i \right)^2} \tag{7.8}$$


where

$$w_i = \sigma^2 w_i' = \frac{1}{g^2(x_i)}$$

An unbiased estimate of the unknown $\sigma^2$ is

$$s^2 = \frac{\sum w_i (y_i - \hat{\alpha} - \hat{\beta} x_i)^2}{n - 2}$$

Hence an estimate of the conditional variance is

$$s_{Y|x}^2 = s^2 g^2(x) \tag{7.9}$$

and

$$s_{Y|x} = s\, g(x) \tag{7.9a}$$

EXAMPLE 7.3


The maximum settlements and maximum differential settlements of 18 storage 
tanks in Libya have been observed; the data are plotted as shown in Fig. E7.3. 


Figure E7.3 Settlement of tanks on Libyan sand (data after Lambe and Whitman, 1969) (axes: maximum settlement, cm; maximum differential settlement, cm; regression line with $\pm s_{Y|x}$ band)


From Fig. E7.3, the scatter of the differential settlement appears to increase with
the maximum settlement. Physically, this would be expected, since the maximum
differential settlement would ordinarily not exceed the maximum settlement. For
these reasons, the conditional standard deviation of the differential settlement Y
may be assumed to increase linearly with the maximum settlement X, or

$$\mathrm{Var}(Y \mid x) = \sigma^2 x^2$$


Table E7.3. Computational Tableau for Example 7.3

       Maximum      Maximum differential
       settlement   settlement
  i      $x_i$          $y_i$         $w_i$    $w_i x_i$   $w_i y_i$   $w_i x_i y_i$   $w_i x_i^2$   $w_i (y_i - \hat{\alpha} - \hat{\beta} x_i)^2$
  1      0.3          0.2        11.11    3.33     2.22      0.67       1.0        0.0178
  2      0.7          0.7         2.04    1.43     1.43      1.00       1.0        0.0816
  3      0.8          0.5         1.56    1.25     0.78      0.62       1.0        0.0066
  4      0.8          1.1         1.56    1.25     1.72      1.37       1.0        0.4465
  5      0.9          0.3         1.23    1.11     0.37      0.33       1.0        0.1339
  6      1.0          0.6         1.00    1.00     0.60      0.60       1.0        0.0090
  7      1.1          0.6         0.83    0.91     0.50      0.55       1.0        0.0212
  8      1.4          1.0         0.51    0.71     0.51      0.71       1.0        0.0010
  9      1.5          1.0         0.44    0.67     0.44      0.66       1.0        0.0002
 10      1.6          1.0         0.39    0.63     0.39      0.62       1.0        0.0028
 11      1.6          1.3         0.39    0.63     0.51      0.81       1.0        0.0180
 12      2.0          1.5         0.25    0.50     0.38      0.75       1.0        0.0060
 13      2.4          1.3         0.17    0.42     0.22      0.53       1.0        0.0158
 14      2.6          2.3         0.15    0.38     0.35      0.90       1.0        0.0479
 15      2.9          1.9         0.12    0.34     0.23      0.66       1.0        0.0001
 16      2.9          2.3         0.12    0.34     0.28      0.80       1.0        0.0164
 17      3.7          1.7         0.07    0.27     0.12      0.44       1.0        0.0394
 18      1.5          0.6         0.44    0.67     0.26      0.40       1.0        0.0776
 $\sum$                              22.38   15.84    11.31     12.42      18.0        0.9418

$$\hat{\beta} = \frac{(22.38)(12.42) - (11.31)(15.84)}{(22.38)(18) - (15.84)^2} = 0.65$$

$$\hat{\alpha} = \frac{11.31 - (0.65)(15.84)}{22.38} = 0.045$$

$$s^2 = \frac{0.9418}{16} = 0.0589$$

Thus

$$s_{Y|x} = 0.243x$$


The required computations are summarized in Table E7.3, from which we obtain
the regression equation for estimating the expected maximum differential settlement
Y (in centimeters) on the basis of information for the maximum settlement X of the
tanks (in cm) as

$$E(Y \mid x) = 0.045 + 0.65x$$

The corresponding standard deviation is

$$s_{Y|x} = 0.243x$$
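The weighted estimates of Eqs. 7.7 through 7.9a can be sketched as follows, using the 18 settlement pairs read from Table E7.3 and the assumed standard-deviation function g(x) = x of Example 7.3. The function name `fit_weighted_line` is hypothetical. Because the weights are carried at full precision rather than rounded to two decimals, the results agree with the tabulated values only to about two significant figures.

```python
from math import sqrt


def fit_weighted_line(xs, ys, g):
    """Weighted least squares with Var(Y|x) = sigma^2 * g(x)^2.

    Returns (alpha_hat, beta_hat, s), where s estimates sigma, so that the
    conditional standard deviation at x is s * g(x) (Eq. 7.9a).
    """
    n = len(xs)
    w = [1.0 / g(xi) ** 2 for xi in xs]          # w_i = 1 / g^2(x_i)
    sw = sum(w)
    swx = sum(wi * xi for wi, xi in zip(w, xs))
    swy = sum(wi * yi for wi, yi in zip(w, ys))
    swxy = sum(wi * xi * yi for wi, xi, yi in zip(w, xs, ys))
    swxx = sum(wi * xi * xi for wi, xi in zip(w, xs))
    beta_hat = (sw * swxy - swy * swx) / (sw * swxx - swx ** 2)   # Eq. 7.8
    alpha_hat = (swy - beta_hat * swx) / sw                        # Eq. 7.7
    wsse = sum(wi * (yi - alpha_hat - beta_hat * xi) ** 2
               for wi, xi, yi in zip(w, xs, ys))
    return alpha_hat, beta_hat, sqrt(wsse / (n - 2))


# Maximum settlement x and maximum differential settlement y (cm), Table E7.3.
x = [0.3, 0.7, 0.8, 0.8, 0.9, 1.0, 1.1, 1.4, 1.5, 1.6, 1.6, 2.0, 2.4, 2.6, 2.9, 2.9, 3.7, 1.5]
y = [0.2, 0.7, 0.5, 1.1, 0.3, 0.6, 0.6, 1.0, 1.0, 1.0, 1.3, 1.5, 1.3, 2.3, 1.9, 2.3, 1.7, 0.6]

alpha_hat, beta_hat, s = fit_weighted_line(x, y, g=lambda v: v)
print(f"E(Y|x) = {alpha_hat:.3f} + {beta_hat:.3f} x,  s_Y|x = {s:.3f} x")
```

Passing g as a function keeps the same routine usable for other assumed variance models; with g(x) = 1 it reduces to the ordinary least-squares case.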


7.2. MULTIPLE LINEAR REGRESSION


The value of an engineering variable may depend on several factors. In
such cases, the mean and variance of the dependent variable will be a
function of the values of several variables. When the mean-value function
is assumed to be linear, the resulting analysis is known as multiple linear
regression.

Linear regression analysis for more than two variables is simply a
generalization of the regression analysis discussed in Section 7.1. Suppose
that the dependent variable of interest is Y, and that it is a linear function
of m variables $X_1, X_2, \ldots, X_m$. The assumptions underlying multiple
regression analysis are as follows.


1. The mean value of Y is a linear function of $x_1, x_2, \ldots, x_m$; that is,

$$E(Y \mid x_1, \ldots, x_m) = \beta_0 + \beta_1 x_1 + \cdots + \beta_m x_m \tag{7.10}$$

where $\beta_0, \beta_1, \ldots, \beta_m$ are constants, to be determined from observed data.
2. The conditional variance of Y given $x_1, \ldots, x_m$ is constant; that is,

$$\mathrm{Var}(Y \mid x_1, \ldots, x_m) = \sigma^2$$

or proportional to a given function of $x_1, \ldots, x_m$; that is,

$$\mathrm{Var}(Y \mid x_1, \ldots, x_m) = \sigma^2 g^2(x_1, \ldots, x_m)$$

The regression analysis then determines estimates for $\beta_0, \beta_1, \ldots, \beta_m$ and
$\sigma^2$ based on a set of observed data $(x_{1i}, x_{2i}, \ldots, x_{mi}, y_i)$, i = 1, ..., n.
Equation 7.10 can be written also as

$$E(Y \mid x_1, \ldots, x_m) = \alpha + \beta_1 (x_1 - \bar{x}_1) + \cdots + \beta_m (x_m - \bar{x}_m) \tag{7.11}$$

in which the $\bar{x}_j$'s are the sample means of $X_j$ and $\alpha$ is simply a readjusted
constant.

Again we restrict our derivation to the case in which the conditional
variance $\mathrm{Var}(Y \mid x_1, \ldots, x_m)$ is constant. The sum of squared errors for a
set of n data points, then, is

$$\Delta^2 = \sum_{i=1}^{n} (y_i - y_i')^2 = \sum_{i=1}^{n} \left[ y_i - \alpha - \beta_1 (x_{1i} - \bar{x}_1) - \cdots - \beta_m (x_{mi} - \bar{x}_m) \right]^2 \tag{7.12}$$


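By the least-squares criterion, we minimize $\Delta^2$ to obtain the following set
of equations for determining the estimates of $\alpha$ and $\beta_j$, j = 1, 2, ..., m:

$$\frac{\partial \Delta^2}{\partial \alpha} = -2 \sum \left[ y_i - \hat{\alpha} - \hat{\beta}_1 (x_{1i} - \bar{x}_1) - \cdots - \hat{\beta}_m (x_{mi} - \bar{x}_m) \right] = 0$$

$$\frac{\partial \Delta^2}{\partial \beta_1} = -2 \sum \left\{ y_i - \hat{\alpha} - \hat{\beta}_1 (x_{1i} - \bar{x}_1) - \cdots - \hat{\beta}_m (x_{mi} - \bar{x}_m) \right\} (x_{1i} - \bar{x}_1) = 0$$

$$\vdots \tag{7.13}$$

$$\frac{\partial \Delta^2}{\partial \beta_m} = -2 \sum \left\{ y_i - \hat{\alpha} - \hat{\beta}_1 (x_{1i} - \bar{x}_1) - \cdots - \hat{\beta}_m (x_{mi} - \bar{x}_m) \right\} (x_{mi} - \bar{x}_m) = 0$$

where $\sum \equiv \sum_{i=1}^{n}$. From the first of these equations, we have

$$\sum y_i - n \hat{\alpha} - \hat{\beta}_1 \sum (x_{1i} - \bar{x}_1) - \cdots - \hat{\beta}_m \sum (x_{mi} - \bar{x}_m) = 0$$

but $\sum (x_{1i} - \bar{x}_1) = \cdots = \sum (x_{mi} - \bar{x}_m) = 0$; thus

$$\hat{\alpha} = \frac{\sum y_i}{n} = \bar{y} \tag{7.14}$$

Substituting this value of $\hat{\alpha}$ into the remaining equations in Eq. 7.13,
we obtain

$$\hat{\beta}_1 \sum (x_{1i} - \bar{x}_1)^2 + \hat{\beta}_2 \sum (x_{1i} - \bar{x}_1)(x_{2i} - \bar{x}_2) + \cdots + \hat{\beta}_m \sum (x_{1i} - \bar{x}_1)(x_{mi} - \bar{x}_m) = \sum (x_{1i} - \bar{x}_1)(y_i - \bar{y})$$

$$\vdots \tag{7.15}$$

$$\hat{\beta}_1 \sum (x_{mi} - \bar{x}_m)(x_{1i} - \bar{x}_1) + \hat{\beta}_2 \sum (x_{mi} - \bar{x}_m)(x_{2i} - \bar{x}_2) + \cdots + \hat{\beta}_m \sum (x_{mi} - \bar{x}_m)^2 = \sum (x_{mi} - \bar{x}_m)(y_i - \bar{y})$$

It can be observed that this represents a set of m linear simultaneous
equations involving the m unknowns $\hat{\beta}_1, \ldots, \hat{\beta}_m$. The solution of Eq. 7.15 yields
the required coefficients $\hat{\beta}_1, \ldots, \hat{\beta}_m$, from which we obtain the least-squares
regression equation

$$E(Y \mid x_1, \ldots, x_m) = \hat{\alpha} + \hat{\beta}_1 (x_1 - \bar{x}_1) + \cdots + \hat{\beta}_m (x_m - \bar{x}_m) = \hat{\beta}_0 + \hat{\beta}_1 x_1 + \cdots + \hat{\beta}_m x_m \tag{7.16}$$

where

$$\hat{\beta}_0 = \hat{\alpha} - \hat{\beta}_1 \bar{x}_1 - \cdots - \hat{\beta}_m \bar{x}_m$$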

The variance of Y about this mean-value function, namely,
$\mathrm{Var}(Y \mid x_1, \ldots, x_m)$, is a measure of the conditional dispersion about the
regression equation. An unbiased estimate of this conditional variance is

$$s_{Y|x_1 \ldots x_m}^2 = \frac{\Delta^2}{n - m - 1} = \frac{\sum \left[ y_i - \hat{\alpha} - \hat{\beta}_1 (x_{1i} - \bar{x}_1) - \cdots - \hat{\beta}_m (x_{mi} - \bar{x}_m) \right]^2}{n - m - 1} \tag{7.17}$$

The corresponding conditional standard deviation, therefore, is

$$s_{Y|x_1 \ldots x_m} = \sqrt{\frac{\Delta^2}{n - m - 1}} \tag{7.17a}$$

Obviously, Eq. 7.17 is valid only if the sample size n is larger than m + 1.
Again, the assumption of normal distribution for Y may be invoked for the
purpose of establishing confidence intervals on the regression coefficients
and for computing probabilities associated with the random variable Y.


EXAMPLE 7.4 


An important factor in the prediction of frost depth for highway pavement design
is the mean annual temperature for the site under consideration. The mean annual
temperature records at 10 different weather stations in West Virginia are
summarized in Table E7.4a.

Since a pavement may be constructed in various locations over the state, where
temperature records may not be available, it is desired to predict the mean annual
temperature of a locality on the basis of its elevation and latitude, using the
information in Table E7.4a. The following equation is assumed:

$$E(Y \mid x_1, x_2) = \beta_0 + \beta_1 x_1 + \beta_2 x_2$$

where
Y = mean annual temperature, in °F
$x_1$ = elevation, in feet, above sea level
$x_2$ = north latitude, in degrees

Determine estimates for $\beta_0$, $\beta_1$, and $\beta_2$ and evaluate the conditional variance
$\mathrm{Var}(Y \mid x_1, x_2)$, which is assumed to be constant. Table E7.4b summarizes the
computations required in the multiple linear regression analysis. From these results,
the mean annual temperature for a locality in West Virginia, with elevation $x_1$ and
latitude $x_2$, is given by

$$E(Y \mid x_1, x_2) = 121.3 - 0.0034x_1 - 1.65x_2$$

whereas the standard deviation $s_{Y|x_1 x_2}$ of the mean annual temperature at any
locality is estimated to be $\sqrt{0.547}$ = 0.74°F. It may be observed from Table E7.4b
that by taking the elevation and latitude of a locality into account in estimating its
mean annual temperature, the variance of Y is reduced by 94.5% of the
unconditional variance.

Table E7.4a. Mean Annual Temperature in West Virginia (Data From
Moulton and Schaub, 1969)

Weather station       Elevation (ft)   North latitude (deg)   Mean annual temperature (°F)
Bayard                     2375              39.27                  47.5
Buckhannon                 1459              39.00                  52.3
Charleston                  604              38.35                  56.8
Flat Top                   3242              37.58                  48.4
Kearneysville               550              39.38                  54.2
Madison                     675              38.05                  55.1
New Martinsville            635              39.65                  54.4
Pickens                    2727              38.66                  48.8
Rainelle                   2424              37.97                  50.5
Wheeling                    659              40.10                  52.7

At Gary, West Virginia, which is located at an elevation of 1426 ft and latitude of
37.37°N, the expected mean annual temperature would be

$$E(Y \mid 1426, 37.37) = 121.3 - 0.0034 \times 1426 - 1.65 \times 37.37 = 54.80\ ^\circ\mathrm{F}$$

with a conditional standard deviation of 0.74°F. If the mean annual temperature is
assumed to be Gaussian, the 10-percentile value of the mean annual temperature $y_{.10}$
is determined as follows:

$$P(Y < y_{.10}) = \Phi\left(\frac{y_{.10} - 54.8}{0.74}\right) = 0.10$$

or

$$y_{.10} = 54.8 + 0.74\, \Phi^{-1}(0.10) = 53.9\ ^\circ\mathrm{F}$$

Table E7.4b. Computational Tableau for Example 7.4 (row entries not recoverable; column totals and resulting estimates only)

$\sum x_{1i} = 15350$, $\sum x_{2i} = 388.01$, $\sum y_i = 520.7$
$\bar{x}_1 = 1535$, $\bar{x}_2 = 38.80$, $\bar{y} = 52.07$
$\sum (x_{1i} - \bar{x}_1)^2 = 9{,}991{,}000$, $\sum (x_{2i} - \bar{x}_2)^2 = 5.973$, $\sum (x_{1i} - \bar{x}_1)(x_{2i} - \bar{x}_2) = -4019$
$\sum (x_{1i} - \bar{x}_1)(y_i - \bar{y}) = -27{,}171$, $\sum (x_{2i} - \bar{x}_2)(y_i - \bar{y}) = 3.76$
Determinant of the coefficient matrix of Eq. 7.15: 43,520,000
$\hat{\beta}_0 = 52.07 + 0.0034 \times 1535 + 1.65 \times 38.80 = 121.3$
$\sum (y_i - y_i')^2 = 3.83$; fraction of the unconditional variance removed: 0.945


7.3. NONLINEAR REGRESSION


Relationships between engineering variables are not always linear, or may
not always be adequately described by linear models. Experimental data
for such variables may show a nonlinear trend between the observed
values of the variables. For example, Fig. 7.2 shows a plot of the average
all-day parking cost in a central business district versus the urban
population for various cities in the United States. Similarly, data for the average
dissolved oxygen (DO) in a pool measured at various temperatures are
shown in Fig. 7.3. Although a linear relation may be used to describe the
general trend between each pair of variables, predictions based on such
linear relationships may overestimate (in certain ranges of the variables) or
underestimate (in other ranges of the variables) the expected result. For
example, the linear regression line in Fig. 7.3 will underestimate the average
DO at temperatures between 23° and 24°C, but may overestimate the
average DO between 24° and 26°C. In such cases, a nonlinear relationship
between the variables would be more appropriate. The determination of
such nonlinear relationships on the basis of observational data involves
nonlinear regression analysis.

Figure 7.2 Average all-day off-street parking rates in central business district (data from Wynn, 1969) (axes: urban population, in 1000; parking cost, dollars/day)

Figure 7.3 Average pool DO and temperature (data from Butts, Schnepper, and Evans, 1970) (axes: average pool temperature T, °C; average pool DO, mg/l)
Nonlinear regression is usually based on an assumed nonlinear (mean-value)
function with certain undetermined coefficients that would be
evaluated from the experimental data. The simplest type of nonlinear
function for the regression of Y on X is

$$E(Y \mid x) = \alpha + \beta g(x) \tag{7.18}$$

where g(x) is a predetermined nonlinear function of x. For example,
g(x) may be $x + x^2$, $e^x$, $\ln x$, or any other function of x. Finally, nonlinear
regression analysis is usually based on the assumption of a constant
variance $\mathrm{Var}(Y \mid x)$, or a variance that is a function of g(x).
By defining a new variable $x' = g(x)$, Eq. 7.18 becomes

$$E(Y \mid x') = \alpha + \beta x' \tag{7.19}$$

which is of the same mathematical form as the linear regression equation
of Eq. 7.1. If the observed data pair $(x_i, y_i)$ is also transformed to
$[g(x_i), y_i]$ or $(x_i', y_i)$, the original problem of nonlinear regression between
x and y is thus converted to a linear regression between the variables x'
and y. The corresponding regression coefficients $\alpha$ and $\beta$ and $\mathrm{Var}(Y \mid x')$
can then be estimated from Eqs. 7.2, 7.3, and 7.5, respectively.
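The transformation idea can be sketched in a few lines: apply g to each x, then run an ordinary straight-line fit on the transformed values. The example below assumes g(x) = ln x and a small made-up data set; `fit_transformed` is an illustrative name only.

```python
import math


def fit_transformed(xs, ys, g):
    """Fit E(Y|x) = alpha + beta * g(x) by linear regression of y on x' = g(x)."""
    xp = [g(x) for x in xs]                      # transformed regressor x'
    n = len(xp)
    xp_bar = sum(xp) / n
    y_bar = sum(ys) / n
    beta_hat = (sum((u - xp_bar) * (y - y_bar) for u, y in zip(xp, ys))
                / sum((u - xp_bar) ** 2 for u in xp))
    alpha_hat = y_bar - beta_hat * xp_bar
    return alpha_hat, beta_hat


# Hypothetical observations; x values are chosen so that ln x = 0, 1, 2, 3.
xs = [math.exp(k) for k in range(4)]
ys = [0.9, 3.2, 4.9, 7.0]
alpha_hat, beta_hat = fit_transformed(xs, ys, math.log)
print(f"E(Y|x) = {alpha_hat:.3f} + {beta_hat:.3f} ln x")
```

The fitted curve is nonlinear in x but linear in the coefficients, which is what makes the straight-line formulas applicable.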


EXAMPLE 7.5


The average all-day parking cost in the central business district of United States
cities may be expressed in terms of the logarithm of the urban population; that is,
modeled with the following nonlinear regression equation:

$$E(Y \mid x) = \alpha + \beta \ln x$$

with a constant $\mathrm{Var}(Y \mid x)$, where
Y = average all-day parking cost (in dollars)
x = urban population (in thousands)

Determine the estimates for $\alpha$, $\beta$, and $\mathrm{Var}(Y \mid x)$ on the basis of the observed data
referred to in Fig. 7.2 and given in Table E7.5.
The required computations for the regression analysis are summarized in Table
E7.5; from these results we obtain the mean-value function

$$E(Y \mid x) = -0.773 + 0.244 \ln x$$

and

$$s_{Y|x}^2 = 0.013$$

or

$$s_{Y|x} = 0.11$$

Figures E7.5a and E7.5b show this regression curve in semilogarithmic and
arithmetic scales, respectively.

Table E7.5. Computational Tableau for Example 7.5 (row entries not recoverable; resulting estimates only)

n = 15; mean of $\ln x_i$ = 6.33; $\bar{y}$ = 0.771; $\sum (\ln x_i)^2$ = 612.8; $\sum y_i^2$ = 9.80; $\sum (y_i - y_i')^2$ = 0.166

$$\hat{\alpha} = 0.771 - 0.244 \times 6.33 = -0.773$$

$$s_{Y|x}^2 = \frac{0.166}{13} = 0.013, \qquad s_{Y|x} = \sqrt{0.013} = 0.11$$

Figure E7.5 Average all-day parking cost vs. urban population. (a) In semilog
plot. (b) In arithmetic plot. (Data from Wynn, 1969)
EXAMPLE 7.6


An exponential model may be used for predicting the average DO concentration
from the average pool temperature T; that is,

$$\mathrm{DO} = \alpha e^{-\beta T}$$

Estimate the coefficients $\alpha$ and $\beta$ based on the data portrayed in Fig. 7.3.
Taking the logarithm on both sides of the equation given above, we have

$$\ln \mathrm{DO} = \ln \alpha - \beta T$$

It may be observed that the right side of this equation is a linear function of T.
Introducing the variables

$$Y = \ln \mathrm{DO}, \qquad x = -T$$

the nonlinear problem, therefore, is reduced to that of a linear regression. That is,

$$E(\ln \mathrm{DO} \mid T) = \ln \alpha - \beta T$$

or

$$E(Y \mid x) = \ln \alpha + \beta x$$

In this case, therefore, the original data are first converted from $(\mathrm{DO}_i, T_i)$ to
$(\ln \mathrm{DO}_i, -T_i)$ and are then used in the linear regression analysis. On this basis,
the regression coefficients are estimated to be

$$\ln \hat{\alpha} = 3.93 \quad \text{and} \quad \hat{\beta} = 0.1123$$

Hence the exponential model is obtained as

$$\ln \mathrm{DO} = 3.93 - 0.1123T$$

or

$$\mathrm{DO} = e^{3.93 - 0.1123T} = 50.9 e^{-0.1123T}$$

This regression equation and the associated 95% confidence interval are shown in
Fig. E7.6.

Figure E7.6 Weighted average pool DO and temperature relationship (after
Butts, Schnepper, and Evans, 1970) (axes: average pool temperature T, °C; average pool DO, mg/l; regression curve with 95% confidence limits)
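The log transformation of Example 7.6 can be exercised on synthetic data. Since the measured DO values are not tabulated here, the sketch below generates noise-free values from the reported model and checks that regressing ln DO on -T recovers the same constants; the temperatures chosen and all names are assumptions.

```python
import math


def fit_line(xs, ys):
    """Least-squares line of y on x; returns (intercept, slope)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
             / sum((x - x_bar) ** 2 for x in xs))
    return y_bar - slope * x_bar, slope


# Synthetic, noise-free DO values generated from DO = exp(3.93 - 0.1123 T).
temps = [23.0, 24.0, 25.0, 26.0, 27.0, 28.0, 29.0]
do_values = [math.exp(3.93 - 0.1123 * t) for t in temps]

# Regress Y = ln DO on x = -T, so that the slope is beta and the intercept is ln alpha.
ln_alpha, beta = fit_line([-t for t in temps], [math.log(d) for d in do_values])
alpha = math.exp(ln_alpha)
print(f"DO = {alpha:.1f} exp(-{beta:.4f} T)")
```

With real measurements the recovered constants would of course carry sampling scatter; the point here is only the mechanics of the transformation.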


The form of nonlinear functions assumed in Eq. 7.18 can be generalized
as follows:

$$E(Y \mid x) = \alpha + \beta_1 g_1(x) + \beta_2 g_2(x) + \cdots + \beta_m g_m(x) \tag{7.20}$$

where $g_j(x)$, j = 1, 2, ..., m, are predetermined functions of the
independent variable x. An example of Eq. 7.20 is the following general
polynomial relation:

$$E(Y \mid x) = \alpha + \beta_1 x + \beta_2 x^2 + \cdots + \beta_m x^m \tag{7.21}$$

We now observe that by the conversion $z_j = g_j(x)$, Eq. 7.20 becomes

$$E(Y \mid z_1, \ldots, z_m) = \alpha + \beta_1 z_1 + \cdots + \beta_m z_m$$

Hence, by considering each of the functions $g_j(x_i)$ evaluated from the
original data set $x_i$, the nonlinear problem of Eq. 7.21 is reduced to that of
a multiple linear regression, presented earlier in Section 7.2.
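A polynomial fit illustrates the reduction to multiple linear regression: each power of x is treated as a separate regressor. The sketch below forms the normal equations with an explicit column of ones for the constant term (an equivalent alternative to the centered form used in Section 7.2) and solves them by Gaussian elimination; the data are synthetic and the function names are hypothetical.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def fit_polynomial(xs, ys, degree):
    """Least-squares coefficients [alpha, beta_1, ..., beta_m] of a polynomial."""
    rows = [[x ** k for k in range(degree + 1)] for x in xs]   # 1, x, x^2, ...
    p = degree + 1
    ata = [[sum(r[j] * r[k] for r in rows) for k in range(p)] for j in range(p)]
    aty = [sum(r[j] * y for r, y in zip(rows, ys)) for j in range(p)]
    return solve(ata, aty)


# Synthetic data lying exactly on y = 1 + 2x + 3x^2.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0 + 2.0 * x + 3.0 * x ** 2 for x in xs]
coeffs = fit_polynomial(xs, ys, degree=2)
print(coeffs)
```

For high-degree polynomials the normal equations become poorly conditioned, so this direct approach is best kept to low degrees.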


7.4. APPLICATIONS OF REGRESSION ANALYSIS IN ENGINEERING


Regression analyses have been used widely in practically all branches of
engineering for obtaining empirical relations between two (or more)
variables. Sometimes the necessary relationship between two engineering
variables cannot be derived on the basis of theoretical considerations; in
these cases the required relationship may be determined empirically on the
basis of experimental observations. For example, by plotting the logarithm
of the observed fatigue life N of a material versus the logarithm of the
applied stress range S, a linear trend is observed as shown in Fig. 7.4.
This trend can be represented by

$$\log N = a - b \log S$$

Linear regression of log N on log S would then yield the constants a and
b. This regression equation also suggests an S-N relation of the form


$$N S^b = 10^a$$
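A log-log fit of this kind can be sketched as follows. The example assumes common (base-10) logarithms and uses synthetic fatigue data generated from chosen constants a = 10 and b = 3, which are illustrative values and not taken from Fig. 7.4.

```python
import math


def fit_line(xs, ys):
    """Least-squares line of y on x; returns (intercept, slope)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
             / sum((x - x_bar) ** 2 for x in xs))
    return y_bar - slope * x_bar, slope


# Synthetic stress ranges S (ksi) and lives N from N = 10**10 * S**(-3).
stress = [10.0, 20.0, 30.0, 40.0]
cycles = [10.0 ** 10 * s ** -3 for s in stress]

# Regress log N on log S: the intercept is a and the slope is -b.
a, slope = fit_line([math.log10(s) for s in stress],
                    [math.log10(n) for n in cycles])
b = -slope
print(f"log N = {a:.2f} - {b:.2f} log S")
```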


Figure 7.4 S-N relation for fatigue of material (data courtesy of W. H. Munse) (axes: cycles to failure N, in 1000; stress range S, ksi; material: A7, A36, A373 mild steel; stress cycle: zero to tension)

Figure 7.5 Peak traffic flow into city center with area A (after Miller, 1970) (axes: area A, sq ft; Q/f, pcu/hour; gradient of least-squares line = 0.53)

Figure 7.6 Compression index vs. void ratio (after Nishida, 1956) (axes: void ratio e; compression index; test results on Mexico City clay by Rutledge)


In other situations the mathematical form of a required relationship may
be derived or postulated from physical considerations; regression analysis
may then be used to determine the values of the parameters, or to assess
the validity of the theoretical equation, on the basis of observational data.

For example, Smeed (1968) postulated that the peak flow of traffic
(in passenger car units, pcu) into the center of a city is

$$Q = \alpha f A^{1/2}$$

where f is the fraction of the city center that is occupied by roadways; A
is the area of the city center in square feet; and $\alpha$ is a constant depending
on the speed of the traffic and the efficiency of the road system. Basically,
this equation is based on the hypothesis that the volume of traffic that can
enter the central area is proportional to its circumference. Data from 35
cities, including 20 from Britain, are shown plotted on log-log paper in
Fig. 7.5. Least-squares linear regression of log(Q/f) on log A yields a slope
of 0.53 for the regression line; Smeed's equation would be equivalent to a
slope of 0.5. Also, from the regression line of Fig. 7.5, the constant $\alpha$ can be
determined as the value of Q/f at A = 1.
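The comparison between the postulated square-root law and the fitted slope becomes explicit once logarithms are taken. A short derivation, assuming Q/f and A are expressed in the units of the plotted data:

$$\log \frac{Q}{f} = \log \alpha + \frac{1}{2} \log A$$

This is a straight line in the variables $\log A$ and $\log (Q/f)$ with slope 1/2 and intercept $\log \alpha$. At $A = 1$ the term $\log A$ vanishes, so the ordinate of the line there is $\log \alpha$; that is, $Q/f = \alpha$.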

Some engineering variables can be measured more readily and economically
than others; for example, initial void ratios of clay samples can be
inexpensively measured in the laboratory whereas the direct determination
of the compression index may require considerable labor and time.
Consequently, if an empirical relation is established between the void ratio
and the compression index of soils, such as the relation shown in Fig. 7.6,
we can simply measure the void ratio and predict the compression index
by using the regression equation. Another example is the determination of
concrete strength. Normally the compressive strength of concrete specimens
is tested after 28 days of curing. At the present rate of construction, 28
days would be a relatively long period; methods of early determination of
concrete strength have been suggested, such as using an accelerated
strength (based on an accelerated curing process). Figure 7.7 shows the
results obtained by Malhotra and Zoldners (1969), which indicate a
linear calibration between the two strengths. In traffic engineering,
Heathington and Tutt (1971) have also succeeded in linearly calibrating the
short-interval traffic volume with long-interval traffic volume in six Texas
cities; some of these results are shown in Fig. 7.8.

Figure 7.7 Relationship of accelerated to 28-day strength; combined field data
from nine jobs across Canada (after Malhotra and Zoldners, 1969) (axes: accelerated strength, psi; 28-day strength, psi and kg/cm$^2$; annotations: total number of test results = 396, correlation coefficient = 0.869, Y = 1801 + 1.286X, standard deviation = 348 psi, ±15% limits)


G. 10 zZ 
3$ 9 E 
a = 
= Bf- Y-ihr Inbound Peok Volume s. 
En |. X*5min Inbound Peak Volume, en 
- g9 T f $o 
ELEME ; Bx 
$5 s 55 
$$ a4 23 
Se T$ 
^a 3 58 
ae i 
> 
2 7! : 
2 $ 
ó 0 - È - 
O 140 280 420 560 700 0 4 28 42 56 70 
Five-Minute-Peak Volume In inbound Orè Hour Peak Volume In jnbound 
; Direction Direction x 10% i 
Zo 10 $ 20 
Fe 9 $ ae 
2 x a s 
a5 e ^ '6[-- Y -24 hr-Vol, In Both Directions 
237 B. 14 X= Thr Vol. In Both Directions 
oy so : 
ax 6 $97 1 "c 
c9 m 
sÈ 5 52 0| 
5 
Ez 4 Tg 8 
28 5$ 
$2-s e$ 6 
o 
ae 2 BU. 
BE 1 5 
$3 | H 2l 
--.0- 160 320 480 640 800 ^ '. Üo. 2 4 — e 10 


Five-Minute Volume In Both Directions 


During Inbound Peok Five Minutes Mri Votume In Both Oipgetlons Dering 


Inbound Peak Hour x |. 
Figure 7 7.8 Relations between traffic volumes for different directions, in 6 Texas 
counties (after Heathington and Tutt, 1971) | . 


[Figure 7.9: stress (% ultimate) plotted against cycles to failure, for
1-in. slump concrete and 6-in. slump concrete.]
Figure 7.9 S-N diagram for concrete beams (after Murdock and Kesler, 1958)


Table 7.1 Multiple Regression for Estimating Trip Generation


Independerit Regression. . aces 
variables equations Syjay +++ 2 zn ers 
Xy, Xo, Xs, Xi Y -433-.-389X, | — 0.87 0.837. 
; — 0.005 X2 
— 0.128 X; i 
(7 — 0.012 X; 
‘Xi, Xë Y -380-379X: |^ 0.87. - 0.835: 
; are. — 0.003 X, . 
Xo, Xi Y - 549 . 1.0 0.764 
i — 0.0089 X> : 
er 0.227 X, 
Xo c Y = 2.88 + 4.00 X; 0.89 0.827. 
X, . y72 1.10 0.718 
í — 0.013 X; 
X, Y = 807 4-044 X, 1.20 , 0.655 
Xs u Y = 3.55:+ 0.74 Xi 1.30. us 0:575 


Y = Expected number of resident trips per dwelling unit
$X_1$ = Automobile ownership (no. per dwelling unit)
$X_2$ = Population density (no. per net residential acre)
$X_3$ = Distance from central business district (miles)
$X_4$ = Family income (thousand dollars)


[Figure 7.10: miles of travel plotted against city population in thousands
(logarithmic scale).]
Figure 7.10 Work trip distance by city size (after Voorhees, 1966)


7.4. APPLICATIONS OF REGRESSION ANALYSIS IN ENGINEERING 313 


[Figure 7.11: log-log plot of river flow against relative distance, $x/x_0$,
where $x_0$ = 200 miles and $Q_0$ = flow at the 200-mile station. Rivers
plotted: Scioto R. above Chillicothe, Ohio; Potomac R. above Washington,
D.C.; Sabine R. above Orange, Texas; Vistula R. above Gdansk, Poland;
Hudson R. above Troy, New York; Delaware R. above Trenton, New Jersey;
Apalachicola-Chattahoochee R. above Chattahoochee, Florida; Allegheny R.
above Natrona, Penn.; Mississippi R. above Winona, Minn.]
Figure 7.11 River flow vs. distance downstream (after Shull and Gloyna, 1969)


Multiple linear regression also finds many applications; for example,
Martin et al. (1963) used multiple linear regression to obtain the expected
number of trip generations Y per dwelling unit in a community as a
function of automobile ownership $X_1$, population density $X_2$, distance
from the central business district $X_3$, and family income $X_4$:

$$Y = 4.33 + 3.89X_1 - 0.005X_2 - 0.128X_3 - 0.012X_4$$

Other multiple linear regression analyses were also performed for Y based
on fewer independent variables. The results of these analyses are
summarized in Table 7.1. It can be observed from the values of $r^2$ that the
proportion of the variance reduced by taking account of the linear trend
generally increases with the number of variables included in the regression
analysis.
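As a small illustration of how such a fitted equation is used, the sketch below evaluates the four-variable trip-generation equation quoted above for one dwelling unit. The function name and the input values are hypothetical, chosen only for illustration; only the coefficients come from the text.

```python
def expected_trips(x1, x2, x3, x4):
    """Expected resident trips per dwelling unit from the four-variable
    regression equation quoted in the text (coefficients from the text)."""
    return 4.33 + 3.89 * x1 - 0.005 * x2 - 0.128 * x3 - 0.012 * x4


# Hypothetical dwelling unit (illustrative values, not from the source):
# 1.2 automobiles, 40 persons per net residential acre, 5 miles from the
# central business district, family income of 8 thousand dollars.
trips = expected_trips(1.2, 40.0, 5.0, 8.0)
print(round(trips, 2))
```

Dropping variables corresponds to switching to one of the shorter equations of Table 7.1 rather than setting inputs to zero, since each equation is fitted separately.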


[Figure 7.12: maximum sustained wind speed plotted against radial distance
from center (logarithmic scales).]
Figure 7.12 Surface hurricane wind profile (after Goldman and Ushijima, 197)

[Figure 7.13: speed, miles/hour, plotted against density, vehicles/mile/lane.]
Figure 7.13 Speed-density relationship (after Payne, 1978)




Nonlinear regression is also widely used in engineering. Aside from those
described earlier in Examples 7.5 and 7.6, Fig. 7.9 shows an application
involving the logarithmic transformation, in which the average stress per
cycle of repeated loading is plotted against the logarithm of the number
of cycles to failure of concrete beams. This is another example of the
S-N diagram for the average fatigue life. In this case, because of large
variability in concrete strengths, a wide scatter is observed. Figure 7.10
shows that the average distance of travel to work may be linearly related
to the logarithm of the city population; as the population in a city increases,
the city spreads out to the suburbs and the average distance of travel to
work also increases. An example of a double logarithmic transformation is
shown in Fig. 7.11, where the logarithm of river flow increases linearly with
the logarithm of distance downstream; similarly, Fig. 7.12 shows that the
maximum sustained wind speed and the radial distance from the center of
a hurricane also follow approximately a log-log relationship.

Polynomial functions are also often used in nonlinear regression. A
third-degree polynomial curve is shown in Fig. 7.13 to describe the mean
vehicle speed as a function of the traffic density.
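The double logarithmic transformation mentioned above can be sketched as follows: a power law $y = c\,x^b$ becomes the straight line $\ln y = \ln c + b \ln x$, which is then fitted by ordinary least squares. The data below are hypothetical and noise-free (generated from $y = 2x^{0.5}$), and the function names are illustrative only.

```python
import math


def fit_line(x, y):
    """Least-squares intercept a and slope b for y = a + b*x."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    sxx = sum((xi - xbar) ** 2 for xi in x)
    b = sxy / sxx
    a = ybar - b * xbar
    return a, b


def fit_power_law(x, y):
    """Fit y = c * x**b by linear regression of ln(y) on ln(x)."""
    a, b = fit_line([math.log(v) for v in x], [math.log(v) for v in y])
    return math.exp(a), b


# Hypothetical data lying exactly on y = 2 * x**0.5
x = [1.0, 4.0, 9.0, 16.0, 25.0]
y = [2.0, 4.0, 6.0, 8.0, 10.0]
c, b = fit_power_law(x, y)
print(c, b)
```

With real data the least-squares criterion is applied to the logarithms, so the fitted curve minimizes squared errors in $\ln y$ rather than in $y$ itself.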


7.5. CORRELATION ANALYSIS. 


7.5.1. Estimation of correlation coefficient 


The study of the degree of linear interrelation between random variables is
called correlation analysis. Recall that in regression analysis, we are
interested in predicting the value of a variable (or estimating associated
probability) for given values of the other variables. However, the accuracy
of a linear prediction will depend on the correlation between the variables.

Mathematically, the correlation between two random variables X and
Y is measured by the correlation coefficient defined in Eq. 3.73 as

$$\rho = \frac{\operatorname{Cov}(X, Y)}{\sigma_X \sigma_Y} = \frac{E[(X - \mu_X)(Y - \mu_Y)]}{\sigma_X \sigma_Y}$$


Based on a set of observed values of X and Y, the correlation coefficient
may be estimated by

$$\hat{\rho} = \frac{1}{n-1}\,\frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{s_x s_y} = \frac{1}{n-1}\,\frac{\sum_{i=1}^{n} x_i y_i - n\,\bar{x}\,\bar{y}}{s_x s_y} \tag{7.22}$$

where $\bar{x}$, $\bar{y}$, $s_x$, and $s_y$ are, respectively, the sample means and sample
standard deviations of X and Y. The value of $\hat{\rho}$ also ranges from -1 to +1
and is a measure of the strength of linear relationship between the two
variables X and Y. If the estimate $\hat{\rho}$ is close to +1 or -1, there is a strong
linear relationship between X and Y, and linear regression analysis may be
carried out to obtain the regression equations. On the other hand, if $\hat{\rho} \approx 0$,
this would indicate a lack of linear relationship between the variables;
such a case is illustrated in Fig. 7.14 between the modulus of rupture and
the modulus of elasticity of laminated wood beams.

[Figure 7.14: modulus of rupture, psi, plotted against modulus of
elasticity, psi.]
Figure 7.14 Comparison of MOE and MOR values of 40-ft laminated beams
(from Galligan and Snodgrass, 1970)

From Eqs. 7.3 and 7.22 it can be shown that

$$\hat{\rho} = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n}(x_i - \bar{x})^2}\,\frac{s_x}{s_y} = \hat{\beta}\,\frac{s_x}{s_y} \tag{7.23}$$

This is a useful relationship between the estimate of $\rho$ and the regression
coefficient $\hat{\beta}$. Furthermore, by substituting Eq. 7.23 into Eq. 7.5, we obtain

$$s_{Y|x}^2 = \frac{1}{n-2}\left[\sum_{i=1}^{n}(y_i - \bar{y})^2 - \hat{\beta}^2 \sum_{i=1}^{n}(x_i - \bar{x})^2\right] = \frac{n-1}{n-2}\,s_y^2\,(1 - \hat{\rho}^2) \tag{7.24}$$

from which we also have

$$\hat{\rho}^2 = 1 - \frac{n-2}{n-1}\,\frac{s_{Y|x}^2}{s_y^2} \tag{7.25}$$

which is equal to $r^2$ of Eq. 7.6 for large n. On this basis, therefore, we can
say that the larger the value of $|\hat{\rho}|$, the greater will be the reduction in the
variance when the trend between the variables is taken into account, and
hence the more accurate will be the prediction based on the regression
equation.
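The relations of Eqs. 7.23 and 7.25 can be checked numerically. The sketch below is only an illustration under stated assumptions: the data are hypothetical, the function name is made up for this example, and the conditional variance is computed directly from the residuals about the least-squares line with n - 2 in the denominator.

```python
import math


def regression_summary(x, y):
    """Return slope, s_x, s_y, correlation estimate, and conditional variance."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    syy = sum((yi - ybar) ** 2 for yi in y)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    beta = sxy / sxx                       # least-squares slope
    alpha = ybar - beta * xbar             # least-squares intercept
    s_x = math.sqrt(sxx / (n - 1))         # sample standard deviations
    s_y = math.sqrt(syy / (n - 1))
    rho = sxy / ((n - 1) * s_x * s_y)      # estimate of Eq. 7.22
    sse = sum((yi - alpha - beta * xi) ** 2 for xi, yi in zip(x, y))
    s2_cond = sse / (n - 2)                # variance about the regression line
    return beta, s_x, s_y, rho, s2_cond


# Hypothetical observations (illustrative only)
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [1.1, 1.9, 3.2, 3.8, 5.3, 5.9]
n = len(x)
beta, s_x, s_y, rho, s2_cond = regression_summary(x, y)

rhs_723 = beta * s_x / s_y                                # right side of Eq. 7.23
rhs_725 = 1 - (n - 2) / (n - 1) * s2_cond / s_y ** 2      # right side of Eq. 7.25
print(rho, rhs_723, rho ** 2, rhs_725)
```

Both identities hold to rounding error for any data set with nonzero scatter in x and y, which is a convenient sanity check on hand calculations.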


EXAMPLE 7.7 


In Table E7.7 are shown the data on blow counts $N_i$ and corresponding
unconfined compressive strength of very stiff clay $q_i$. These data are also
shown in Fig. E7.7.

On the basis of these data, estimate the correlation coefficient $\rho$ between the blow
count and the unconfined strength of stiff clay.

The required calculations are indicated and summarized in Table E7.7. From
these results, we estimate the correlation coefficient using Eq. 7.22 to be

$$\hat{\rho} = \frac{\tfrac{1}{9}\,[492.77 - 10(18.7)(2.12)]}{\sqrt{95.12}\,\sqrt{1.24}} = 0.99$$

This indicates that there is very high correlation between blow counts and the
unconfined compressive strength of stiff clay; on this basis, therefore, the blow
count may be used to estimate the unconfined strength of stiff clay.

Table E7.7 Computations for Example 7.7

                Compressive
Blow counts     strength (tsf)
   N_i             q_i            N_i^2      q_i^2     N_i q_i
     4             0.33              16       0.11        1.32
     8             0.90              64       0.81        7.20
    11             1.41             121       1.99       15.51
    16             1.99             256       3.96       31.84
    17             1.70             289       2.89       28.90
    19             2.25             361       5.06       42.75
    21             2.60             441       6.76       54.60
    25             2.71             625       7.34       67.75
    32             3.33            1024      11.09      106.56
    34             4.01            1156      16.08      136.34
Sum 187           21.23            4353      56.09      492.77

$\bar{N} = 187/10 = 18.7$; $\bar{q} = 21.23/10 = 2.12$

$s_N^2 = \tfrac{1}{9}[4353 - 10(18.7)^2] = 95.12$; $s_q^2 = \tfrac{1}{9}[56.09 - 10(2.12)^2] = 1.24$

[Figure E7.7: unconfined compressive strength plotted against N, blow
counts per foot of penetration.]
Figure E7.7 Unconfined compressive strength vs. blow counts for stiff clay
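The hand calculation of Example 7.7 can be reproduced with a few lines of code. The sketch below applies the sums form of the estimator (Eq. 7.22) to the blow counts and strengths of Table E7.7; the function name is illustrative. Because it carries unrounded means and squares, the result may differ from the hand value in the third decimal place, but it falls between 0.98 and 0.99.

```python
import math

# Blow counts and unconfined compressive strengths (tsf) from Table E7.7
N = [4, 8, 11, 16, 17, 19, 21, 25, 32, 34]
q = [0.33, 0.90, 1.41, 1.99, 1.70, 2.25, 2.60, 2.71, 3.33, 4.01]


def sample_correlation(x, y):
    """Estimate the correlation coefficient from sums of squares and products."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    s_x = math.sqrt((sum(v * v for v in x) - n * xbar ** 2) / (n - 1))
    s_y = math.sqrt((sum(v * v for v in y) - n * ybar ** 2) / (n - 1))
    sxy = sum(a * b for a, b in zip(x, y)) - n * xbar * ybar
    return sxy / ((n - 1) * s_x * s_y)


rho_hat = sample_correlation(N, q)
print(round(rho_hat, 3))
```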


EXAMPLE 7.8


For the data recorded on the Monocacy River (described earlier in Examples 5.8
and 7.2), estimate the correlation coefficient between runoff and precipitation.

Based on the computations tabulated earlier for Example 5.8, we obtain the
sample variance of precipitation $s_x^2 = 1.53$ and the sample variance of runoff
$s_y^2 = 0.36$. From the calculations in Example 7.2, we also have

$$\sum_{i=1}^{25} x_i y_i - 25\,\bar{x}\,\bar{y} = 59.24 - 25(2.16)(0.80) = 16.04$$

Hence

$$\hat{\rho} = \frac{(1/24)(16.04)}{\sqrt{1.53}\,\sqrt{0.36}} = 0.90$$


The correlation coefficient is required when calculating the joint
probabilities of two or more random variables that are jointly normal (see
Example 3.25). However, for non-normal variates the quantitative role of
the correlation coefficient in the computation of joint probabilities is
seldom defined. Nevertheless, the correlation coefficient is a measure of
linear interdependency between two random variables irrespective of their
distributions.




Multiple correlation. When more than two random variables are involved,
as in the case of multiple linear regression of Eq. 7.16, any pair of
variables may be mutually correlated, for example, between $X_i$ and $X_j$, or
between Y and $X_i$; the corresponding correlation coefficients are

$$\rho_{X_i X_j} = \frac{E[(X_i - \mu_{X_i})(X_j - \mu_{X_j})]}{\sigma_{X_i}\sigma_{X_j}} \tag{7.26}$$

and can be estimated as

$$\hat{\rho}_{x_i x_j} = \frac{1}{n-1}\,\frac{\sum_{k=1}^{n} x_{ik}x_{jk} - n\,\bar{x}_i\,\bar{x}_j}{s_{x_i}s_{x_j}} \tag{7.27}$$


7.6. CONCLUDING REMARKS


The statistical method for determining the mean and variance of one
random variable as a function of the values of other variables is known
as regression analysis. On the basis of the least-squares criterion, regression
analysis provides a systematic approach for the empirical determination
of the underlying relationships among the random variables. Furthermore,
the associated correlation analysis determines the degree of linear
interrelationship between the variables (in terms of the correlation coefficient);
a high correlation means the existence of a strong linear relationship
between the variables, whereas a low correlation would mean the lack of
linear relationship (however, there could be a nonlinear relationship).
Regression and correlation analyses have applications in many areas of
engineering, and are especially significant in situations where the necessary
relationships must be developed empirically.


PROBLEMS 


7.1 Assume hypothetically that the concentration of dissolved solids and the
turbidity of a stream are measured simultaneously for five separate days,
selected at random throughout a year. The data are as follows.

       Dissolved solids    Turbidity
Day        (mg/l)            (JTU)
 1          400                5
 2          550               30
 3          700               32
 4          800               58
 5          500               20

Because turbidity is easier to measure, a regression equation may be used to
predict the concentration of dissolved solids on the basis of known turbidity.
Assume that the variance of dissolved solid concentration is constant with
turbidity.
(a) What are the values of the intercept and slope parameters ($\alpha$ and $\beta$) of
the regression line? Ans. 364.7; 7.79.
(b) Estimate the standard deviation of dissolved solid concentration about
the regression line. Ans. 58.8.


7.2 Suppose that data on the consumption of water per capita per day have been
collected for four towns in the Midwest and tabulated as follows (see also
Fig. P7.2).

                          Per capita water
          Population      consumption (in
Town      (in 10^4)       100 gal/day)
 1           1.0              1.0
 2           4.0              1.3
 3           6.0              1.3
 4           9.0              1.4

(a) If the effect of population size of a town on the per capita consumption
is neglected, determine the sample variance $s_Y^2$.
(b) From the observed data, there seems to be a general trend that the per
capita water consumption increases with the population of the town.
Suppose it is assumed that

$$E(Y \mid x) = \alpha + \beta x$$

and Var$(Y \mid x)$ is constant for all x.
(i) Determine the least-squares estimates for $\alpha$ and $\beta$.
(ii) Estimate $s_{Y|x}$.
(c) Suppose an engineer is interested in studying the consumption of water in
Urbana (a town with 50,000 population). Assume a normal distribution
for Y; determine the probability that the demand for water in Urbana
will exceed 7,000,000 gal/day.

[Figure P7.2: per capita consumption, 100 gal/day, plotted against
population, in 10^4.]
Figure P7.2
7.3 Dissolved oxygen (DO) concentration (in parts per million, ppm) in a
stream is found to decrease with the time of travel downstream (Thayer and
Krutchkoff, 1966). Assume a linear relationship between the mean DO and
the time of travel t. Determine the least-squares regression equation and
estimate the standard deviation about the regression line from the following
set of observations.

DO (ppm)    Time of travel t (days)
  0.28              0.5
  0.29              1.0
  0.29              1.6
  0.18              1.8
  0.17              2.6
  0.18              3.2
  0.10              3.8
  0.12              4.7

7.4 From a survey of the effect of fare increase on the loss in ridership for transit
systems throughout the United States, the following data were obtained.

      X                    Y
Fare increase       Loss in ridership
     (%)                  (%)
      5                   1.5
     35                  12.0
     20                   7.5
     15                   6.3
      4                   1.2
      6                   1.7
     18                   7.2
     23                   8.0
     38                  11.1
      8                   3.6
     12                   3.7
     17                   6.6
     17                   4.4
     13                   4.5
      7                   2.8
     23                   8.0

[Figure P7.14: a row of 20 footings and the 19 pairs of adjacent footings.]
Figure P7.14


exists between the settlement behavior of two adjacent footings. The following 
is a set of data on the settlement of a series of footings on sand. 


Footing    Settlement (in.)    Footing    Settlement (in.)
   1            0.59              11           0.93
   2            0.60              12           0.78
   3            0.54              13           0.78
   4            0.70              14           0.77
   5            0.75              15           0.79
   6            0.80              16           0.79
   7            0.79              17           0.78
   8            0.95              18           0.77
   9            1.00              19           0.63
  10            0.92              20           0.75

From a row of 20 footings, 19 pairs of adjacent footings can be obtained as
shown in Fig. P7.14. The degree of dependence between the settlements of
adjacent footings is described by the correlation coefficient.
(a) Estimate this correlation based on the 19 pairs of data. Ans. 0.766.
(b) Estimate the coefficient of variation of the settlement of a footing. Ans.
0.157.


8. The Bayesian Approach 


8.1. INTRODUCTION 


In Chapter 5 we presented the methods of point and interval estimation of
distribution parameters, based on the classical statistical approach. Such an
approach assumes that the parameters are constants (but unknown) and
that sample statistics are used as estimators of these parameters. Because
the estimators are invariably imperfect, errors of estimation are
unavoidable; in the classical approach, confidence intervals are used to express the
degree of these errors.

As implied earlier, accurate estimates of parameters require large amounts
of data. When the observed data are limited, as is often the case in
engineering, the statistical estimates have to be supplemented (or may even be
superseded) by judgmental information. With the classical statistical
approach there is no provision for combining judgmental information with
observational data in the estimation of the parameters.

For illustration, consider a case in which a traffic engineer wishes to
determine the effectiveness of the road improvement at an intersection.
Based on his experience with similar sites and traffic conditions, and on a
traffic-accident model, he estimated that the average occurrence of
accidents at the improved intersection would be about twice a year.
However, during the first week after the improved intersection is opened to
traffic, an accident occurs at the intersection. A dichotomy, therefore, may
arise: The engineer may hold strongly to his judgmental belief, in which
case he would insist that the accident is only a chance occurrence and the
average accident rate remains twice a year, in spite of the most recent
accident. However, if he only considers actual observed data, he would
estimate the average accident rate to be once a week. Intuitively, it would
seem that both types of information are relevant and ought to be used in
determining the average accident rate. Within the classical method of
statistical estimation, however, there is no formal basis for such analysis.
Problems of this type are formally the subject of Bayesian estimation.

The Bayesian method approaches the estimation problem from another
point of view. In this case, the unknown parameters of a distribution are
assumed (or modeled) to be also random variables. In this way, uncertainty
associated with the estimation of the parameters can be combined formally
(through Bayes' theorem) with the inherent variability of the basic random
variable. With this approach, subjective judgments based on intuition,
experience, or indirect information are incorporated systematically with
observed data to obtain a balanced estimation. The Bayesian method is
particularly helpful in cases where there is a strong basis for such
judgments. We introduce the basic concepts of the Bayesian approach in the
following sections.

[Figure 8.1: probability masses $p_i = P(\Theta = \theta_i)$ plotted at the
discrete values $\theta_1, \theta_2, \ldots, \theta_n$.]
Figure 8.1 Prior PMF of parameter $\theta$


8.2. BASIC CONCEPTS—THE DISCRETE CASE


The Bayesian approach has special significance to engineering design, where
available information is invariably limited and subjective judgment is
often necessary. In the case of parameter estimation, the engineer often
has some knowledge (perhaps inferred intuitively from experience) of the
possible values, or range of values, of a parameter; moreover, he may also
have some intuitive judgment on the values that are more likely to occur
than others. For simplicity, suppose that the possible values of a parameter
$\theta$ were assumed to be a set of discrete values $\theta_i$, $i = 1, 2, \ldots, n$, with
relative likelihoods $p_i = P(\Theta = \theta_i)$ as illustrated in Fig. 8.1 ($\Theta$ is the
random variable whose values represent possible values of the parameter $\theta$).
Then if additional information becomes available (such as the results of
a series of tests or experiments), the prior assumptions on the parameter $\theta$
may be modified formally through Bayes' theorem as follows.
Let $\varepsilon$ denote the observed outcome of the experiment. Then applying
Bayes' theorem of Eq. 2.18, we obtain the updated PMF for $\Theta$ as

$$P(\Theta = \theta_i \mid \varepsilon) = \frac{P(\varepsilon \mid \Theta = \theta_i)\,P(\Theta = \theta_i)}{\sum_{j=1}^{n} P(\varepsilon \mid \Theta = \theta_j)\,P(\Theta = \theta_j)} \tag{8.1}$$

The various terms in Eq. 8.1 can be interpreted as follows:

$P(\varepsilon \mid \Theta = \theta_i)$ = the likelihood of the experimental outcome $\varepsilon$ if
$\Theta = \theta_i$; that is, the conditional probability of
obtaining a particular experimental outcome
assuming that the parameter is $\theta_i$

$P(\Theta = \theta_i)$ = the prior probability of $\Theta = \theta_i$; that is, prior to
the availability of the experimental information $\varepsilon$

$P(\Theta = \theta_i \mid \varepsilon)$ = the posterior probability of $\Theta = \theta_i$; that is, the
probability that has been revised in the light of
the experimental outcome $\varepsilon$

Denoting the prior and posterior probabilities as $P'(\Theta = \theta_i)$ and
$P''(\Theta = \theta_i)$, respectively, Eq. 8.1 becomes

$$P''(\Theta = \theta_i) = \frac{P(\varepsilon \mid \Theta = \theta_i)\,P'(\Theta = \theta_i)}{\sum_{j=1}^{n} P(\varepsilon \mid \Theta = \theta_j)\,P'(\Theta = \theta_j)} \tag{8.1a}$$

Equation 8.1a, therefore, gives the posterior probability mass function of $\Theta$.
(In general, we shall use $'$ and $''$ to denote the prior and posterior.)


The expected value of $\Theta$ is then commonly used as the Bayesian estimator*
of the parameter; that is,

$$\hat{\theta}'' = E(\Theta \mid \varepsilon) = \sum_{i=1}^{n} \theta_i\,P''(\Theta = \theta_i) \tag{8.2}$$

We may point out that in Eq. 8.2 observational data and judgmental
information are both used and combined in a systematic way to estimate
the underlying parameter.

In the Bayesian framework, the significance of judgmental information
is reflected also in the calculation of relevant probabilities. In the case
above, where subjective judgments were used in the estimation of the
parameter $\theta$, such judgments would be reflected in the calculation of the
probability, for example, $P(X \le a)$, through the theorem of total
probability using the posterior PMF of Eq. 8.1a. That is,

$$P(X \le a) = \sum_{i=1}^{n} P(X \le a \mid \Theta = \theta_i)\,P''(\Theta = \theta_i) \tag{8.3}$$

This represents the up-to-date probability of the event $(X \le a)$ based on
all available information. It may be emphasized that in Eq. 8.3 the
uncertainty associated with the error of estimating the parameter [as
reflected in $P''(\Theta = \theta_i)$] is combined with the inherent variability of the
random variable X.

To clarify these general concepts, consider the following examples.

* There are other Bayesian estimators depending on the assumed form of the "loss
function" (discussed in Vol. II). Moreover, other parameters of the posterior
distribution may serve as the estimator instead; for example, the mode.


EXAMPLE 8.1 


Piles for a building foundation were initially designed for 250-ton capacity each;
however, this did not include the effect of high winds that occur only very rarely. On
such rare occasions, it is estimated that some of the piles may be subjected to loads as
high as 300 tons. In order to assess the safety of the initial design, the engineer in
charge wishes to determine the probability of the piles failing under a maximum load
of 300 tons.

Suppose that from the engineer's experience with this type of piles and the soil
condition at the site, he estimated (judgmentally) that the probability p would range
from 0.2 to 1.0 with 0.4 as the most likely value; more specifically, p is described by
the prior PMF shown in Fig. E8.1a. The values of p are discretized at 0.2 intervals to
simplify the presentation.

On the basis of this prior PMF, the estimated probability of a pile failing at a load
of 300 tons would be (by virtue of the total probability theorem)

$$\hat{p}' = (0.2)(0.3) + (0.4)(0.4) + (0.6)(0.15) + (0.8)(0.10) + (1.0)(0.05) = 0.44$$

In order to supplement his judgment, the engineer ordered a pile of the same type
test-loaded at the site to a maximum load of 300 tons. The outcome of the test shows
that the pile failed to carry the maximum load. Based on this single test result, the
PMF of p would be revised according to Eq. 8.1a, obtaining the posterior PMF as
follows:

$$P''(p = 0.2) = \frac{(0.2)(0.3)}{(0.2)(0.3) + (0.4)(0.4) + (0.6)(0.15) + (0.8)(0.10) + (1.0)(0.05)} = 0.136$$

and, similarly,

$P''(p = 0.4) = 0.364$
$P''(p = 0.6) = 0.204$
$P''(p = 0.8) = 0.182$
$P''(p = 1.0) = 0.114$

which are shown graphically in Fig. E8.1b.
The Bayesian estimate for p, Eq. 8.2, therefore is

$$\hat{p}'' = E(p \mid \varepsilon) = 0.2(0.136) + 0.4(0.364) + 0.6(0.204) + 0.8(0.182) + 1.0(0.114) = 0.55$$

[Figure E8.1a: prior probabilities $P'(p = p_i)$ plotted at $p_i$ = 0.2, 0.4,
0.6, 0.8, 1.0.]
Figure E8.1a Prior PMF of p

[Figure E8.1b: posterior probabilities $P''(p = p_i)$ plotted at $p_i$ = 0.2,
0.4, 0.6, 0.8, 1.0.]
Figure E8.1b Posterior PMF of p

In Fig. E8.1b, we see that as a result of the single unsuccessful load test, the
probabilities for higher values of $p_i$ are increased from those of the prior distribution,
resulting in a higher estimate for p, namely, $\hat{p}'' = E(p \mid \varepsilon) = 0.55$, whereas the prior
estimate was 0.44. Observe that the failure of one test pile does not imply the
impossibility of such piles carrying the 300-ton load; instead, the test result merely
serves to increase the estimated probability by 0.11 (from 0.44 to 0.55). Figure E8.1c
illustrates how the PMF of p changes with an increasing number of consecutive test
pile failures; the distribution shifts toward p = 1.0 as $n \to \infty$. The corresponding
Bayesian estimate for p is shown in Fig. E8.1d as a function of the number of
consecutive failures; after several consecutive failures the estimate for p is 0.90. If a
long sequence of failures is observed, the Bayesian estimate of p approaches 1.0, a
result that tends to the classical estimate; in such a case, there is an overwhelming
amount of observed data to supersede any prior judgment. Ordinarily, however,
where observational data are limited, judgment would be important and is reflected
properly in the Bayesian estimation process.

[Figure E8.1c: a sequence of posterior PMFs $P''(p = p_i)$, each plotted at
$p_i$ = 0.2, 0.4, 0.6, 0.8, 1.0.]
Figure E8.1c PMF of p for increasing number of test pile failures

Now suppose that each main column is supported on a group of three piles. If the
piles carry equal loads and are statistically independent, the probability that none of
the piles supporting a column will fail at a total column load of 900 tons (300 tons
per pile) can be obtained by Eq. 8.3. Based on the posterior PMF of Fig. E8.1b, and
denoting X as the number of piles failing, the required probability is

$$\begin{aligned}
P(X = 0) &= P(X = 0 \mid p = 0.2)P''(p = 0.2) + P(X = 0 \mid p = 0.4)P''(p = 0.4) \\
&\quad + \cdots + P(X = 0 \mid p = 1.0)P''(p = 1.0) \\
&= (0.8)^3(0.136) + (0.6)^3(0.364) + (0.4)^3(0.204) + (0.2)^3(0.182) \\
&= 0.163
\end{aligned}$$
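The calculations of Example 8.1 follow directly from Eqs. 8.1a, 8.2, and 8.3, and can be sketched in code as below. The function and variable names are illustrative only; the prior PMF and the likelihood (the probability of the observed failure is simply p) are those of the example.

```python
def posterior_pmf(values, prior, likelihood):
    """Discrete Bayesian update (Eq. 8.1a): posterior is proportional to
    likelihood times prior, normalized so that it sums to one."""
    weights = [likelihood(v) * w for v, w in zip(values, prior)]
    total = sum(weights)
    return [w / total for w in weights]


def expectation(values, pmf):
    """Expected value of a discrete random variable (Eq. 8.2)."""
    return sum(v * w for v, w in zip(values, pmf))


# Prior PMF of the failure probability p (Fig. E8.1a)
p_values = [0.2, 0.4, 0.6, 0.8, 1.0]
prior = [0.30, 0.40, 0.15, 0.10, 0.05]

# One test pile failed, so the likelihood of the outcome is p itself
post = posterior_pmf(p_values, prior, lambda p: p)

p_prior_est = expectation(p_values, prior)   # estimate before the test
p_post_est = expectation(p_values, post)     # Bayesian estimate after the test

# Eq. 8.3: probability that none of three independent piles fails
p_none_fail = sum((1 - p) ** 3 * w for p, w in zip(p_values, post))

print([round(w, 3) for w in post], round(p_post_est, 2), round(p_none_fail, 3))
```

Further consecutive failures can be handled by calling `posterior_pmf` again with the previous posterior as the new prior, which is how the sequence of PMFs in Fig. E8.1c arises.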


[Figure E8.1d: Bayesian estimate $\hat{p}''$ plotted against number of consecutive
failures, from 0 to 10.]
Figure E8.1d $\hat{p}''$ vs. no. of consecutive failures


EXAMPLE 8.2


A traffic engineer is interested in the average rate of accidents $\nu$ at an improved
road intersection. Suppose that from his previous experience with similar road and
traffic conditions, he deduced that the expected accident rate would be between one
and three per year, with an average of two, and the prior PMF shown in Fig. E8.2.
The occurrence of accidents is assumed to be a Poisson process.
During the first month after completion of the improvement, one accident occurred.
(a) In the light of this observation, revise the estimate for $\nu$.
(b) Using the result of part (a), determine the probability of no accident in the
next six months.


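Solutions
(a) Let $\varepsilon$ be the event that an accident occurred in one month. The posterior
probabilities then are

$$\begin{aligned}
P''(\nu = 1) &= \frac{P(\varepsilon \mid \nu = 1)P'(\nu = 1)}{P(\varepsilon \mid \nu = 1)P'(\nu = 1) + P(\varepsilon \mid \nu = 2)P'(\nu = 2) + P(\varepsilon \mid \nu = 3)P'(\nu = 3)} \\
&= \frac{e^{-1/12}(1/12)(0.3)}{e^{-1/12}(1/12)(0.3) + e^{-1/6}(1/6)(0.4) + e^{-1/4}(1/4)(0.3)} \\
&= 0.166
\end{aligned}$$

Similarly,

$P''(\nu = 2) = 0.411$
$P''(\nu = 3) = 0.423$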

[Figure E8.2: prior probabilities $P'(\nu)$ plotted at $\nu$ = 1, 2, 3 per year.]
Figure E8.2 Prior distribution of $\nu$


Hence the updated value of $\nu$ is

$$\hat{\nu}'' = E(\nu \mid \varepsilon) = (0.166)(1) + (0.411)(2) + (0.423)(3) = 2.26 \text{ accidents per year}$$

(b) Let A be the event of no accidents in the next six months. Then

$$\begin{aligned}
P(A) &= P(A \mid \nu = 1)P''(\nu = 1) + P(A \mid \nu = 2)P''(\nu = 2) + P(A \mid \nu = 3)P''(\nu = 3) \\
&= e^{-1/2}(0.166) + e^{-1}(0.411) + e^{-3/2}(0.423) \\
&= 0.346
\end{aligned}$$
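Example 8.2 can be reproduced numerically as sketched below. The names are illustrative; the prior PMF (0.3, 0.4, 0.3 at rates of 1, 2, and 3 accidents per year) and the Poisson likelihoods are those used in the solution. Full-precision arithmetic gives posterior probabilities that agree with the tabulated three-decimal values to within about 0.002.

```python
import math


def poisson_pmf(k, mean):
    """Probability of exactly k occurrences for a Poisson count with given mean."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


rates = [1.0, 2.0, 3.0]      # possible accident rates, per year
prior = [0.3, 0.4, 0.3]      # prior PMF used in the example

# Likelihood of exactly one accident in one month (1/12 of a year)
weights = [poisson_pmf(1, nu / 12) * w for nu, w in zip(rates, prior)]
total = sum(weights)
post = [w / total for w in weights]

# (a) updated estimate of the accident rate, Eq. 8.2
nu_hat = sum(nu * w for nu, w in zip(rates, post))

# (b) probability of no accident in the next six months (1/2 year), Eq. 8.3
p_no_accident = sum(poisson_pmf(0, nu / 2) * w for nu, w in zip(rates, post))

print([round(w, 3) for w in post], round(nu_hat, 2), round(p_no_accident, 3))
```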


8.3. THE CONTINUOUS CASE


8.3.1. General formulation


In Section 8.2 the possible values of the parameter $\theta$ (such as p in Example
8.1 and $\nu$ in Example 8.2) were limited to a discrete set of values; this was
purposely assumed to simplify the presentation of the concepts underlying
the Bayesian method of estimation. In many situations, however, the value
of a parameter could be in a continuum of possible values. Hence, it
would be appropriate to assume the parameter to be a continuous random
variable in the Bayesian estimation. In this case we develop the
corresponding results, analogous to Eqs. 8.1 through 8.3, as follows.

[Figure 8.2: prior density $f'(\theta)$ plotted against $\theta$, with the interval
from $\theta_i$ to $\theta_i + \Delta\theta$ marked.]
Figure 8.2 Continuous prior distribution of parameter $\theta$
Let $\Theta$ be the random variable for the parameter of a distribution, with a
prior density function $f'(\theta)$ shown in Fig. 8.2. The prior probability that $\Theta$
will be between $\theta_i$ and $\theta_i + \Delta\theta$ then is $f'(\theta_i)\,\Delta\theta$. Then, if $\varepsilon$ is an observed
experimental outcome, the prior distribution $f'(\theta)$ can be revised in the
light of $\varepsilon$ using Bayes' theorem, obtaining the posterior probability that $\Theta$
will be in $(\theta_i, \theta_i + \Delta\theta)$ as

$$f''(\theta_i)\,\Delta\theta = \frac{P(\varepsilon \mid \theta_i)\,f'(\theta_i)\,\Delta\theta}{\sum_{j=1}^{n} P(\varepsilon \mid \theta_j)\,f'(\theta_j)\,\Delta\theta}$$

where $P(\varepsilon \mid \theta_i) = P(\varepsilon \mid \theta_i < \Theta \le \theta_i + \Delta\theta)$. In the limit, this yields

$$f''(\theta) = \frac{P(\varepsilon \mid \theta)\,f'(\theta)}{\int_{-\infty}^{\infty} P(\varepsilon \mid \theta)\,f'(\theta)\,d\theta} \tag{8.4}$$

The term $P(\varepsilon \mid \theta)$ is the conditional probability or likelihood of observing
the experimental outcome $\varepsilon$ assuming that the value of the parameter is $\theta$.
Hence $P(\varepsilon \mid \theta)$ is a function of $\theta$ and is commonly referred to as the
likelihood function of $\theta$ and denoted $L(\theta)$. The denominator is independent of $\theta$;
this is simply a normalizing constant required to make $f''(\theta)$ a proper
density function. Equation 8.4 then can be expressed as

$$f''(\theta) = k\,L(\theta)\,f'(\theta) \tag{8.5}$$

where the normalizing constant $k = \left[\int_{-\infty}^{\infty} L(\theta)\,f'(\theta)\,d\theta\right]^{-1}$, and
$L(\theta)$ = the likelihood of observing the experimental outcome $\varepsilon$ assuming a
given $\theta$.

We observe from Eq. 8.5 that both the prior distribution and the
likelihood function contribute to the posterior distribution of $\Theta$. In this way, as
in the discrete case, the significance of judgment and of observational data
is combined properly and systematically; the former through $f'(\theta)$ and
the latter in $L(\theta)$.

Analogous to the discrete case, Eq. 8.2, the expected value of $\Theta$ is
commonly used as the point estimator of the parameter. Hence the updated
estimate of the parameter $\theta$, in the light of observational data $\varepsilon$, is given by

$$\hat{\theta}'' = E(\Theta \mid \varepsilon) = \int_{-\infty}^{\infty} \theta\,f''(\theta)\,d\theta \tag{8.6}$$

The uncertainty in the estimation of the parameter can be included in
the calculation of the probability associated with a value of the underlying
random variable. For example, if X is a random variable whose probability
distribution depends on the parameter $\theta$, then

$$P(X \le a) = \int_{-\infty}^{\infty} P(X \le a \mid \theta)\,f''(\theta)\,d\theta \tag{8.7}$$

Physically, Eq. 8.7 is the average probability of $(X \le a)$ weighted by the
posterior probabilities of the parameter $\theta$.


EXAMPLE 8.3 


Consider again the problem of Example 8.1, in which the probability of pile
failure at a load of 300 tons is of concern; this time, however, assume that the
probability p is a continuous random variable. If there is no (prior) factual
information on p, a uniform prior distribution may be assumed (known as the diffuse
prior), namely,

$$f'(p) = 1.0 \qquad 0 \le p \le 1$$

On the basis of a single test, the likelihood function is simply the probability of the
event $\varepsilon$ = capacity of test pile less than 300 tons, which is simply p. Hence the
posterior distribution of p, according to Eq. 8.5, is

$$f''(p) = k\,p\,(1.0) \qquad 0 \le p \le 1$$

in which the constant

$$k = \left[\int_0^1 p\,dp\right]^{-1} = 2$$

Thus

$$f''(p) = 2p \qquad 0 \le p \le 1$$

The Bayesian estimate of p then is

$$\hat{p}'' = E(p \mid \varepsilon) = \int_0^1 p \cdot 2p\,dp = 0.667$$

If a sequence of n piles were tested, out of which r piles failed at loads less than the
maximum test load, then the likelihood function is the probability of observing
r failures among the n piles tested. If the failure probability of each pile is p, and
statistical independence is assumed between piles, the likelihood function would be

$$L(p) = \binom{n}{r} p^r (1 - p)^{n-r}$$

Then, with the diffuse prior, the posterior distribution of p becomes

$$f''(p) = k \binom{n}{r} p^r (1 - p)^{n-r} \qquad 0 \le p \le 1$$

where

$$k = \left[\int_0^1 \binom{n}{r} p^r (1 - p)^{n-r}\,dp\right]^{-1}$$

Thus the Bayesian estimator is

$$\hat{p}'' = E(p \mid \varepsilon) = \frac{\int_0^1 p^{r+1}(1 - p)^{n-r}\,dp}{\int_0^1 p^r (1 - p)^{n-r}\,dp}$$

Repeated integration-by-parts of the above integrals, each of which has the form
$\int_0^1 p^a (1 - p)^b\,dp = a!\,b!/(a + b + 1)!$, yields

$$\hat{p}'' = \frac{r + 1}{n + 2}$$

From this result, we may observe that as the number of tests n increases (with the
ratio r/n remaining constant), the Bayesian estimate for p approaches that of the
classical estimate; that is,

$$\frac{r + 1}{n + 2} \to \frac{r}{n} \qquad \text{for large } n$$
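The closed-form estimator $(r + 1)/(n + 2)$ of Example 8.3 can be checked by evaluating the two integrals numerically. The sketch below uses a simple composite midpoint rule, which is an implementation choice made here for illustration and not part of the example; the function names and the 3-failures-in-10-tests case are likewise hypothetical.

```python
def integrate(f, a, b, steps=20000):
    """Composite midpoint rule; adequate for smooth integrands on [a, b]."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))


def bayes_estimate_diffuse_prior(r, n):
    """Posterior mean of p after r failures in n tests with a uniform prior.
    The binomial coefficient cancels between numerator and denominator."""
    num = integrate(lambda p: p ** (r + 1) * (1 - p) ** (n - r), 0.0, 1.0)
    den = integrate(lambda p: p ** r * (1 - p) ** (n - r), 0.0, 1.0)
    return num / den


single_test = bayes_estimate_diffuse_prior(1, 1)   # one test, one failure
ten_tests = bayes_estimate_diffuse_prior(3, 10)    # hypothetical: 3 of 10 fail

print(round(single_test, 3), round(ten_tests, 3))
```

The same numerical approach applies to Eqs. 8.5 through 8.7 when the prior and likelihood do not lead to integrals with a closed form.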


EXAMPLE 8.4 


An engineer is designing a temporary structure subjected to wind load on a newly
developed island in the Pacific. Of interest is the probability p that the annual
maximum wind speed will not exceed 120 km/hr. Records for the annual maximum wind
speed in the island are available only for the last five years; and among these, the
120 km/hr wind was exceeded only once. However, an adjacent island has a longer
record of wind speeds. After a comparative study of the geographical condition in
the two islands, the engineer inferred from this longer record that the average value
of p for the newly developed island is 2/3 with a COV of 27%. Since p is bounded
between 0 and 1.0, the following beta distribution (consistent with the above
statistics) is also assumed for the prior distribution:

$$f'(p) = 20\,p^3 (1 - p) \qquad 0 \le p \le 1$$

In this case, the likelihood that the annual maximum wind speed will exceed 120 kph
in one out of five years is

$$L(p) = \binom{5}{4} p^4 (1 - p)$$


vr DALEDIAN AUF KVALI 


fip) 


Posterlor 


[j Q2 043: . 06 0.8 107 p 


Figure E8.4 Prior, likelihood, and posterior functions 


Hence the posterior density function of p is 
fp -khpr 
T {5 : ES 
- d (ra - p aora -p. 


i ` = 100kp"(1 — py 
where 


1 -1 $i 
k= p 100p — p? «| =3.6 
o 


f'()-36pü -pè Ospsl 


In this case, the prior density function is equivalent to the assumption. of one 
exceedance in four years, whereas the resulting posterior distribution is tantamount 
to two exceedances in nine years. In fact, the above posterior distribution is the 
same as that obtained for a case in which two exceedarices were observed in nine 
years and a diffused prior distribution is assumed. This example should serve also to 
illustrate a property of the Bayesian approach—namely, that information from 
sources other than the observed data can be useful in the estimation process. 


Thus 


The relation between the likelihood function and the prior and posterior dis-, 


tributions of the parameter p is illustrated in Fig. E8.4. Observe that the posterior 
distribution is “sharper” than ‘either the prior distribution or the likelihood 


function. This implies that more information is "contained" in the posterior dis- . 


tribution than in either the prior or the likelihood function. 
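The beta-binomial update of Example 8.4 can be reproduced in a few lines. This is a minimal sketch under the assumption that the beta density is parameterized as $p^{q-1}(1-p)^{r-1}$; the helper names `beta_constant`, `beta_mean_cov`, and `update_beta` are illustrative and do not come from the text.

```python
from math import gamma, sqrt


def beta_constant(q: float, r: float) -> float:
    """Normalizing constant of the beta density p**(q-1) * (1-p)**(r-1)."""
    return gamma(q + r) / (gamma(q) * gamma(r))


def beta_mean_cov(q: float, r: float) -> tuple[float, float]:
    """Mean and coefficient of variation of a beta distribution."""
    mean = q / (q + r)
    var = q * r / ((q + r) ** 2 * (q + r + 1))
    return mean, sqrt(var) / mean


def update_beta(q: float, r: float, non_exceed: int, exceed: int) -> tuple[float, float]:
    """Conjugate update: add the observed counts to the prior parameters."""
    return q + non_exceed, r + exceed


q_prior, r_prior = 4, 2                      # prior 20 p^3 (1 - p)
prior_const = beta_constant(q_prior, r_prior)
prior_mean, prior_cov = beta_mean_cov(q_prior, r_prior)

# Five years of record: four years without exceedance, one with exceedance.
q_post, r_post = update_beta(q_prior, r_prior, non_exceed=4, exceed=1)
post_const = beta_constant(q_post, r_post)   # coefficient of p^7 (1 - p)^2
post_mean, post_cov = beta_mean_cov(q_post, r_post)

print(prior_const, prior_mean, prior_cov)
print(q_post, r_post, post_const, post_mean, post_cov)
```

The prior constant and posterior constant come out as the coefficients 20 and 360 of the two densities above, and the posterior COV is smaller than the prior COV, which is the "sharpening" noted in the example.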


EXAMPLE 8.5 


The occurrences of earthquakes may be modeled as a Poisson process with mean
occurrence rate ν (Benjamin, 1968). Suppose that historical record for a region A
shows that n₀ earthquakes have occurred in the past t₀ years. The corresponding
likelihood function is then given by

$$L(\nu) = P(n_0 \text{ quakes in } t_0 \text{ years} \mid \nu) = \frac{(\nu t_0)^{n_0} e^{-\nu t_0}}{n_0!}, \qquad \nu \ge 0$$

If there is no other information for estimating ν, a uniform diffuse prior may be
assumed; this implies that f'(ν) is independent of the values of ν and thus can be
absorbed into the normalizing constant k. Then the posterior distribution of ν
becomes

$$f''(\nu) = k\,L(\nu) = k\,\frac{(\nu t_0)^{n_0} e^{-\nu t_0}}{n_0!}, \qquad \nu \ge 0$$

Upon normalization, k = t₀; this result may also be obtained by comparing the
foregoing f''(ν) with the gamma density function of Eq. 3.44a (for the random
variable ν).

The probability of the event (E = n earthquakes in the next t years in region A) is
then given by Eq. 8.7 as follows:

$$P(E) = \int_0^\infty P(E \mid \nu)\, f''(\nu)\, d\nu
= \int_0^\infty \frac{(\nu t)^n e^{-\nu t}}{n!} \cdot \frac{t_0 (\nu t_0)^{n_0} e^{-\nu t_0}}{n_0!}\, d\nu$$

$$= \frac{(n + n_0)!}{n!\, n_0!} \cdot \frac{t^n\, t_0^{\,n_0 + 1}}{(t + t_0)^{n + n_0 + 1}}
\left( \int_0^\infty \frac{(t + t_0)\,[(t + t_0)\nu]^{n + n_0}\, e^{-(t + t_0)\nu}}{(n + n_0)!}\, d\nu \right)$$

Since the integrand inside the parentheses is a gamma density function, the integral
is equal to 1.0. Hence

$$P(E) = \frac{(n + n_0)!}{n!\, n_0!} \cdot \frac{t^n\, t_0^{\,n_0 + 1}}{(t + t_0)^{n + n_0 + 1}}
= \frac{(n + n_0)!}{n!\, n_0!} \cdot \frac{(t/t_0)^n}{(1 + t/t_0)^{n + n_0 + 1}}$$

a result that was first derived by Benjamin (1968).

As an illustration, suppose that historical records in region A show that two
earthquakes with intensity exceeding VI (MM scale) had occurred in the last 60
years. The probability that there will be no earthquakes with this intensity in the
next 20 years, therefore, is

$$P(E) = \frac{(0 + 2)!}{0!\, 2!} \cdot \frac{(20/60)^0}{(1 + 20/60)^{0 + 2 + 1}} = 0.42$$
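The final formula is easy to evaluate for any n. The sketch below is illustrative only; the function name `prob_future_quakes` and its argument names are assumptions, not notation from the text. It also checks that the probabilities over all n sum to one, as they must for a valid distribution.

```python
from math import comb


def prob_future_quakes(n: int, t: float, n0: int, t0: float) -> float:
    """P(n quakes in the next t years | n0 quakes observed in the past t0 years),
    using a diffuse prior on the Poisson rate."""
    ratio = t / t0
    # comb(n + n0, n) equals (n + n0)! / (n! n0!)
    return comb(n + n0, n) * ratio ** n / (1 + ratio) ** (n + n0 + 1)


# Illustration from the text: two quakes in 60 years, none in the next 20 years.
p_none = prob_future_quakes(0, 20, 2, 60)

# The probabilities over n = 0, 1, 2, ... should add up to 1.
total = sum(prob_future_quakes(n, 20, 2, 60) for n in range(80))

print(round(p_none, 4), round(total, 6))
```

Truncating the sum at 80 terms is an arbitrary choice; the terms decay geometrically, so the remainder is negligible for this ratio t/t₀.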


8.3.2. A special application of the Bayesian updating process

An interesting application of the Bayesian updating process is in the
inspection and detection of material defects (Tang, 1973). Fatigue and
fracture failures in metal structures are frequently the result of unchecked
propagation of flaws or cracks in the joints (welds) or base metals.
Periodic inspection and repair can be used to minimize the risk of fracture
failure by limiting the existing flaw sizes. Methods of detecting flaws, such
as nondestructive testing (NDT), however, are invariably imperfect;
consequently, not all flaws may be detected during an inspection.

Figure 8.3 Detectability versus actual flaw depth (data from Packman et al.,
1968). Axes: probability of flaw detection versus flaw size, in.

The probability of detecting a flaw generally increases with the flaw size
and the detection power of the device. An example of a detectability curve
for the ultrasonic method is shown in Fig. 8.3. Hence, even when a structure is
inspected and all detected flaws are repaired, it is difficult to ensure that
there are no flaws larger than some specified size.

Suppose that an NDT device is used to inspect a set of welds in a structure
and all detected flaws are fully repaired. On the basis of this assumption,
the flaws that remain in the weld would be those that were not
detected. Let X be the flaw size and D the event that a flaw is detected.
The probability that a flaw size (for example, depth) will be between x and
(x + dx) given that the flaw was not detected is, therefore,

$$P(x < X \le x + dx \mid \bar{D}) = \frac{P(\bar{D} \mid x < X \le x + dx)\, P(x < X \le x + dx)}{P(\bar{D})}$$

This can be expressed also in terms of density functions as

$$f_X(x \mid \bar{D}) = k\, P(\bar{D} \mid x)\, f_X(x), \qquad x \ge 0 \tag{8.8}$$

in which $f_X(x)$ is the distribution of the flaw size prior to inspection and
repair, whereas $f_X(x \mid \bar{D})$ is the corresponding distribution after inspection
and repair. Also $P(\bar{D} \mid x) = 1 - P(D \mid x)$, where $P(D \mid x)$ is simply the
probability of detecting a flaw with depth x, which is the function defined
by the detectability curve, such as that shown in Fig. 8.3. Comparing Eq.
8.8 with Eq. 8.5, we observe that Eq. 8.8 is of the same form as Eq. 8.5, with
the following equivalences:

$f_X(x \mid \bar{D})$ ~ the posterior distribution
$P(\bar{D} \mid x)$ ~ the likelihood function
$f_X(x)$ ~ the prior distribution
EXAMPLE 8.6 


As an illustration, suppose the initial (prior) distribution of flaw depths X in a
series of welds has a triangular shape described as follows (see Fig. E8.6):

$$f_X(x) = \begin{cases} 208.3x & 0 < x \le 0.06 \\ 20 - 125x & 0.06 < x \le 0.16 \\ 0 & x > 0.16 \end{cases}$$

Assume also that the NDT device used in the inspection has the detectability curve
shown in Fig. 8.3; mathematically, this curve is given by

$$P(D \mid x) = \begin{cases} 0 & x \le 0 \\ 8x & 0 < x \le 0.125 \\ 1.0 & x > 0.125 \end{cases}$$

Substituting the appropriate expressions for each interval of X into Eq. 8.8, we
obtain the updated density function of flaw depths

$$f_X(x \mid \bar{D}) = \begin{cases} 0 & x \le 0 \\ k(1 - 8x)(208.3x) & 0 < x \le 0.06 \\ k(1 - 8x)(20 - 125x) & 0.06 < x \le 0.125 \\ 0 & x > 0.125 \end{cases}$$

which, after normalization, becomes

$$f_X(x \mid \bar{D}) = \begin{cases} 0 & x \le 0 \\ 495.4x - 3964x^2 & 0 < x \le 0.06 \\ 47.6 - 678x + 2379x^2 & 0.06 < x \le 0.125 \\ 0 & x > 0.125 \end{cases}$$

Figure E8.6 Distribution of flaw depth (prior, likelihood, and posterior versus flaw depth, in.).

The above prior, likelihood, and posterior functions are plotted in Fig. E8.6. It can
be observed that the likelihood function, which is the "complementary function" of
Fig. 8.3, behaves as a filter; it cuts off flaws larger than 0.125 in. and also eliminates
many of the remaining larger flaws. Thus, after the inspection and repair program,
the distribution of flaw depth is shifted toward smaller values.
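The normalizing constant in Example 8.6 can be checked by numerical integration. The sketch below is an illustration, not the authors' procedure: it uses a simple midpoint rule (the helper `integrate` and its step count are assumptions) and the piecewise prior and detectability functions stated in the example.

```python
def prior(x: float) -> float:
    """Triangular prior density of flaw depth (in.)."""
    if 0 < x <= 0.06:
        return 208.3 * x
    if 0.06 < x <= 0.16:
        return 20 - 125 * x
    return 0.0


def p_detect(x: float) -> float:
    """Detectability curve P(D | x) of the NDT device."""
    if x <= 0:
        return 0.0
    if x <= 0.125:
        return 8 * x
    return 1.0


def unnormalized_posterior(x: float) -> float:
    """P(not detected | x) times the prior density, as in Eq. 8.8 without k."""
    return (1 - p_detect(x)) * prior(x)


def integrate(f, a: float, b: float, steps: int = 20000) -> float:
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))


k = 1 / integrate(unnormalized_posterior, 0, 0.16)
prior_mean = integrate(lambda x: x * prior(x), 0, 0.16)
post_mean = k * integrate(lambda x: x * unnormalized_posterior(x), 0, 0.16)

print(round(k, 3), round(prior_mean, 4), round(post_mean, 4))
```

Multiplying k by the polynomial coefficients reproduces the normalized density above, and the posterior mean depth is smaller than the prior mean, consistent with the shift toward smaller flaws.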


8.4. BAYESIAN CONCEPTS IN SAMPLING THEORY 


8.4.1. General formulation

If the experimental outcome ε in Eq. 8.4 is a set of observed values
$x_1, x_2, \ldots, x_n$, representing a random sample (see Section 5.2.1) from a
population X with underlying density function $f_X(x)$, the probability of
observing this particular set of values, assuming that the parameter of the
distribution is θ, is

$$P(\varepsilon \mid \theta) = \prod_{i=1}^{n} f_X(x_i \mid \theta)\, dx$$

Then, if the prior density function of Θ is f'(θ), the corresponding posterior
density function becomes, according to Eq. 8.4,

$$f''(\theta) = \frac{\left[\prod_{i=1}^{n} f_X(x_i \mid \theta)\, dx\right] f'(\theta)}{\int_{-\infty}^{\infty} \left[\prod_{i=1}^{n} f_X(x_i \mid \theta)\, dx\right] f'(\theta)\, d\theta} = k\,L(\theta)\,f'(\theta) \tag{8.9}$$

in which the normalizing constant is

$$k = \left[\int_{-\infty}^{\infty} \left(\prod_{i=1}^{n} f_X(x_i \mid \theta)\right) f'(\theta)\, d\theta\right]^{-1}$$

whereas the likelihood function L(θ) is the product of the density function
of X evaluated at $x_1, x_2, \ldots, x_n$, or

$$L(\theta) = \prod_{i=1}^{n} f_X(x_i \mid \theta) \tag{8.10}$$

Using the posterior density function for Θ of Eq. 8.9 in Eq. 8.6, we therefore
obtain the Bayesian estimator of the parameter θ. It is interesting to
observe that the likelihood function of Eq. 8.10 is the same as that given
earlier in Eq. 5.4 in connection with the classical method of maximum
likelihood estimation. Furthermore, if a diffuse prior distribution is
assumed (for example, as in Eq. 8.13), then the mode of the posterior
distribution, Eq. 8.9, gives the maximum likelihood estimator.


8.4.2. Sampling from normal population 


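In the case of a Gaussian population with known standard deviation σ,
the likelihood function for the parameter μ, according to Eq. 8.10, is

$$L(\mu) = \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left[-\frac{1}{2}\left(\frac{x_i - \mu}{\sigma}\right)^2\right] = \prod_{i=1}^{n} N_\mu(x_i, \sigma)$$

where $N_\mu(x_i, \sigma)$ denotes the density function of μ with mean value $x_i$ and
standard deviation σ. It can be shown (for instance, Tang, 1971) that the
product of m normal density functions with respective means $\mu_i$ and standard
deviations $\sigma_i$ is also (to within a normalizing constant) a normal density
function with mean and variance

$$\mu^* = \frac{\sum_{i=1}^{m} (\mu_i/\sigma_i^2)}{\sum_{i=1}^{m} (1/\sigma_i^2)} \qquad \text{and} \qquad (\sigma^*)^2 = \frac{1}{\sum_{i=1}^{m} (1/\sigma_i^2)} \tag{8.11}$$

Therefore the likelihood function L(μ) becomes

$$L(\mu) = N_\mu\left(\frac{\sum_{i=1}^{n} (x_i/\sigma^2)}{\sum_{i=1}^{n} (1/\sigma^2)},\ \frac{1}{\sqrt{\sum_{i=1}^{n} (1/\sigma^2)}}\right) = N_\mu\left(\bar{x},\ \frac{\sigma}{\sqrt{n}}\right) \tag{8.12}$$

where $\bar{x}$ is the sample mean of Eq. 5.1.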


Without prior information. In the absence of prior information on
μ, a diffuse prior distribution may be assumed. In such a case we obtain
the posterior distribution for μ as

$$f''(\mu) = k\,L(\mu) = k\,N_\mu\left(\bar{x},\ \frac{\sigma}{\sqrt{n}}\right) \tag{8.13}$$

where k is necessarily equal to 1.0 upon normalization. Therefore, without
prior information, the posterior distribution of μ is Gaussian with a mean
value equal to the sample mean $\bar{x}$ and standard deviation $\sigma/\sqrt{n}$.
Using the expected value of μ as the Bayesian estimator we obtain, in
accordance with Eq. 8.6,

$$\hat{\mu}'' = E(\mu \mid \varepsilon) = \bar{x}$$

That is, the sample mean $\bar{x}$ is the point estimate of the population mean.
We recognize that this is the same as the classical estimate of Eq. 5.1.
Therefore, in the absence of prior information, the Bayesian and classical
methods give the same estimates for the population mean. Conceptually,
however, the Bayesian basis for this estimate differs from that of the
classical approach. Whereas Eq. 8.13 says that the posterior distribution of
μ is Gaussian with mean $\bar{x}$ and standard deviation $\sigma/\sqrt{n}$, the classical
approach (of Sect. 5.2) says the sample mean $\bar{X}$ is a Gaussian random
variable with mean μ and standard deviation $\sigma/\sqrt{n}$.


Significance of prior information. In contrast to the classical
approach, however, prior information can be included in the estimation of
the parameter μ. This is accomplished explicitly through the prior
distribution f'(θ); we demonstrate this for the case of a Gaussian population
as follows.

In the case where X is Gaussian with known variance, it is mathematically
convenient to assume also a Gaussian prior (see Sect. 8.4.4).
Suppose that f'(μ) is N(μ', σ'). Then, with the likelihood function of Eq.
8.12, the posterior distribution of μ becomes

$$f''(\mu) = k\,L(\mu)\,f'(\mu) = k\,N_\mu\left(\bar{x},\ \frac{\sigma}{\sqrt{n}}\right) N_\mu(\mu', \sigma')$$

which is a product of two normal density functions. Again, it can be shown
that f''(μ) is also Gaussian with mean

$$\mu'' = \frac{\bar{x}\,(\sigma')^2 + \mu'\,(\sigma^2/n)}{(\sigma')^2 + \sigma^2/n} \tag{8.14}$$

and standard deviation

$$\sigma'' = \sqrt{\frac{(\sigma')^2\,(\sigma^2/n)}{(\sigma')^2 + \sigma^2/n}} \tag{8.15}$$

In this case the Bayesian estimator of μ, Eq. 8.6, yields

$$\hat{\mu}'' = \mu''$$

That is, the Bayesian estimate of the mean value is an average of the prior
mean μ' and the sample mean $\bar{x}$, weighted inversely by the respective
variances.

Equation 8.14 is an example of how prior information is combined
systematically with observed data, in the present case to estimate the
mean value μ.
It is important to observe that the posterior variance of μ, as given by
Eq. 8.15, is always less than* (σ')² or (σ²/n); that is, the variance of the
posterior distribution is always less than that of the prior distribution or
of the likelihood function.

* Since (σ')² > 0 and σ²/n > 0,

$$(\sigma'')^2 = \frac{(\sigma')^2\,(\sigma^2/n)}{(\sigma')^2 + \sigma^2/n} = (\sigma')^2 \left[\frac{\sigma^2/n}{(\sigma')^2 + \sigma^2/n}\right] < (\sigma')^2$$

Similarly, it can be shown that (σ'')² < σ²/n.

On the basis of the posterior distribution of μ, that is, $N_\mu(\bar{x}, \sigma/\sqrt{n})$
of Eq. 8.13 or $N_\mu(\mu'', \sigma'')$ with Eqs. 8.14 and 8.15, we may also determine
the interval for μ corresponding to a specified probability. For example,
the probability that μ is between a and b is given by

$$P(a < \mu \le b) = \int_a^b f''(\mu)\, d\mu$$

8.4.3. Error in estimation

Any error in the estimation of a parameter θ can be combined with the
inherent variability of the underlying random variable, for example X, to
obtain the total uncertainty associated with X. Accounting for the error in
the estimation of θ, the density function of X becomes (by virtue of the
total probability theorem)

$$f_X(x) = \int_{-\infty}^{\infty} f_X(x \mid \theta)\, f''(\theta)\, d\theta \tag{8.16}$$

In the case of a Gaussian variate X, with known σ, and μ estimated
from sample data,

$$f_X(x) = \int_{-\infty}^{\infty} f_X(x \mid \mu)\, f''(\mu)\, d\mu$$

where $f_X(x \mid \mu) = N_X(\mu, \sigma)$, and f''(μ) is given by Eq. 8.13. Again it can
be shown (for instance, Tang, 1971) that this last integral yields the normal
density function $N_X(\bar{x}, \sqrt{\sigma^2 + \sigma^2/n})$; that is,

$$f_X(x) = N\left(\bar{x},\ \sqrt{\sigma^2 + \sigma^2/n}\right) \tag{8.17}$$

The overall uncertainty in X here is reflected in its variance, σ² + σ²/n,
which is composed of the variance of the basic random variable X and that
of the parameter μ. Effectively, the error in the estimation of μ serves to
increase the total uncertainty in X, by an amount that decreases with the
sample size n.
EXAMPLE 8.7 


A toll bridge was recently opened to traffic. For the past two weeks, records on
rush-hour traffic during the last 10 workdays showed a sample mean of 1535
vehicles per hour (vph). Suppose that rush-hour traffic has a normal distribution
with a standard deviation of 164 vph. Based on this observational information, the
posterior distribution of the mean rush-hour traffic μ is, according to Eq. 8.13,
N(1535, 164/√10) or N(1535, 51.9) vph. The point estimate of μ, therefore, is
1535 vph.

The probability that μ will be between 1500 and 1600 vph is given by

$$P(1500 < \mu \le 1600) = \Phi\left(\frac{1600 - 1535}{51.9}\right) - \Phi\left(\frac{1500 - 1535}{51.9}\right) = \Phi(1.253) - \Phi(-0.674) = 0.6445$$

Of greater interest are probabilities associated with the rush-hour traffic (rather
than its mean) on a given workday. Suppose that for the present toll collection
procedure, serious problems would arise if the rush-hour traffic exceeds 1700 vph on
a given day. Then the probability that this will occur on any given day, based on
Eq. 8.17, is given by

$$P(X > 1700) = 1 - \Phi\left(\frac{1700 - 1535}{\sqrt{(164)^2 + (51.9)^2}}\right) = 1 - \Phi(0.958) = 0.169$$

In other words, in about 17% of the working days, the present toll collection system
will be inadequate during rush hours. Observe that the error in the estimation of μ
has been included in computing this probability.

Now suppose that before the toll bridge was opened for traffic, simulation was
performed to predict the rush-hour traffic on the bridge. Based on the simulation
results alone, it was estimated that the mean rush-hour traffic on a workday would
be 1500 ± 100 with 90% confidence. How can this information be used with the
observed traffic flow in the estimation of μ?

Assuming a Gaussian prior and with the foregoing simulation results, we obtain
the prior distribution of the mean rush-hour traffic μ to be N(1500, 60.8) vph. Then,
applying Eqs. 8.14 and 8.15, the posterior distribution of μ is Gaussian with

$$\mu'' = \frac{1535\,(60.8)^2 + 1500\,(51.9)^2}{(60.8)^2 + (51.9)^2} = 1520 \text{ vph}$$

and

$$\sigma'' = \sqrt{\frac{(60.8)^2\,(51.9)^2}{(60.8)^2 + (51.9)^2}} = 39.5 \text{ vph}$$

Therefore, by incorporating the result of simulation, the estimated mean rush-hour
traffic is 1520 vph and the corresponding standard deviation is 39.5 vph.
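The calculations of Example 8.7 can be retraced with the standard library alone. In this minimal sketch the helper names `phi` and `combine_normal` are illustrative assumptions; `phi` evaluates the standard normal CDF through `math.erf`, and `combine_normal` applies Eqs. 8.14 and 8.15.

```python
from math import erf, sqrt


def phi(z: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1 + erf(z / sqrt(2)))


def combine_normal(prior_mean: float, prior_sd: float,
                   sample_mean: float, sample_se: float) -> tuple[float, float]:
    """Posterior mean and standard deviation per Eqs. 8.14 and 8.15.
    sample_se is sigma / sqrt(n), the standard deviation of the likelihood."""
    v_prior, v_data = prior_sd ** 2, sample_se ** 2
    mean = (sample_mean * v_prior + prior_mean * v_data) / (v_prior + v_data)
    sd = sqrt(v_prior * v_data / (v_prior + v_data))
    return mean, sd


sigma, n, xbar = 164.0, 10, 1535.0
se = sigma / sqrt(n)                         # about 51.9 vph

# Probability that the mean lies in (1500, 1600], diffuse prior (Eq. 8.13).
p_interval = phi((1600 - xbar) / se) - phi((1500 - xbar) / se)

# Probability that traffic on a given day exceeds 1700 vph (Eq. 8.17).
p_exceed = 1 - phi((1700 - xbar) / sqrt(sigma ** 2 + se ** 2))

# Posterior after including the simulation-based prior N(1500, 60.8).
post_mean, post_sd = combine_normal(1500.0, 60.8, xbar, se)

print(round(se, 1), round(p_interval, 3), round(p_exceed, 3))
print(round(post_mean), round(post_sd, 1))
```

Because the sketch carries the unrounded value of σ/√n, its results can differ from the tabulated values in the example in the third or fourth significant figure.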


EXAMPLE 8.8

Five repeated measurements of the elevation (relative to a fixed datum) of a
bridge pier under construction were made as follows:

20.45 m
20.38 m
20.51 m
20.42 m
20.46 m

Assume that the measurement error is Gaussian with zero mean and standard
deviation 0.08 m.
(a) Estimate the actual elevation of the pier based on the given measurements.
(b) Suppose that the elevation of the pier was previously measured by another
surveying crew; the elevation was estimated to be 20.42 ± 0.02 m (that is, the mean
measurement was 20.42 m with a standard error of 0.02 m). Estimate the elevation of
the pier taking advantage of this prior information.

Solution

The estimation of an actual dimension δ in surveying and photogrammetry is
equivalent to the estimation of the mean value of a random variable (see Section
5.2.3). Measurement error is invariably assumed to be Gaussian with zero mean;
this means tacitly that a set of measurements constitutes a sample from a normal
population. Therefore the results derived in Section 8.4.2 are applicable to the
estimation of geometric quantities in surveying and photogrammetry.

(a) The sample mean of the five measurements is

$$\bar{d} = \tfrac{1}{5}(20.45 + 20.38 + 20.51 + 20.42 + 20.46) = 20.444 \text{ m}$$

Hence, on the basis of the five observations, the actual elevation of the pier has a
Gaussian distribution N(20.444, 0.08/√5) or N(20.444, 0.036) m. In the convention
of surveying and photogrammetry, the elevation of the pier would be given as
20.444 ± 0.036 m.

(b) In the case where prior information is available, such information can be
incorporated through the prior distribution of δ. In the present case, using the pier
elevation measured earlier by another crew, the prior distribution of δ can be
modeled as N(20.420, 0.020) m. Then applying Eqs. 8.14 and 8.15, the Bayesian
estimate of the elevation is

$$\hat{\delta}'' = \frac{(20.420)(0.036)^2 + (20.444)(0.020)^2}{(0.036)^2 + (0.020)^2} = 20.426 \text{ m}$$

and the corresponding standard error is

$$\sigma_{\delta}'' = \sqrt{\frac{(0.036)^2\,(0.020)^2}{(0.036)^2 + (0.020)^2}} = 0.017 \text{ m}$$


EXAMPLE 8.9 


The annual maximum flow of a stream has been recorded for the last five years as
follows:

21.5, 19.2, 23.4, 20.1, 18.1 (100 m³/sec)

Based on extensive data from adjacent streams, the annual maximum stream flow
may be modeled by a log-normal distribution. Assume that the parameter ζ in the
log-normal distribution is equal to the value obtained from the sample values.
The problem here is to estimate the parameter λ.

In Chapter 4 (Example 4.2) it is shown that if a random variable Y is log-normal,
then X = ln Y is normal. Hence the logarithm of the stream flow will be Gaussian
with mean λ and known standard deviation ζ.

The natural logarithms of the above data values are, respectively,

3.07, 2.96, 3.15, 3.00, 2.90

Thus we obtain the sample mean $\bar{x}$ = 3.016, and sample standard deviation
s = 0.097.

Without any prior information, the posterior distribution of λ, according to
Eq. 8.13, is N($\bar{x}$, s/√5) or N(3.016, 0.097/√5) = N(3.016, 0.043).

If prior information is available, it can be incorporated through the prior
distribution of λ. For example, suppose that f'(λ) is assumed to be N(2.9, 0.06); then
from Eqs. 8.14 and 8.15 the posterior distribution f''(λ) will be normal with

$$\mu_\lambda'' = \frac{3.016\,(0.06)^2 + 2.9\,(0.0435)^2}{(0.06)^2 + (0.0435)^2} = 2.98$$

and

$$\sigma_\lambda'' = \sqrt{\frac{(0.06)^2\,(0.0435)^2}{(0.06)^2 + (0.0435)^2}} = 0.035$$

That is, in this latter case, the posterior distribution of λ is N(2.98, 0.035).
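The steps of Example 8.9 (log-transform the data, then apply the normal-normal update) can be sketched as follows. The variable names are illustrative assumptions. Note that the sketch works with unrounded logarithms, so its sample statistics differ slightly from the values in the example, which are based on logarithms rounded to two decimals.

```python
from math import log, sqrt
from statistics import mean, stdev

flows = [21.5, 19.2, 23.4, 20.1, 18.1]       # annual maximum flows, 100 m^3/sec

# X = ln Y is normal when Y is log-normal.
logs = [log(q) for q in flows]
xbar = mean(logs)                            # estimate of lambda without a prior
s = stdev(logs)                              # sample standard deviation, taken as zeta
se = s / sqrt(len(logs))                     # standard deviation of the posterior (Eq. 8.13)

# Normal prior N(2.9, 0.06) on lambda, combined per Eqs. 8.14 and 8.15.
prior_mean, prior_sd = 2.9, 0.06
v_prior, v_data = prior_sd ** 2, se ** 2
post_mean = (xbar * v_prior + prior_mean * v_data) / (v_prior + v_data)
post_sd = sqrt(v_prior * v_data / (v_prior + v_data))

print(round(xbar, 3), round(s, 3), round(se, 3))
print(round(post_mean, 2), round(post_sd, 3))
```

The posterior mean falls between the prior mean 2.9 and the sample mean, closer to the sample mean because the data carry the smaller variance.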


8.4.4. Use of conjugate distributions

In deriving the posterior distribution of a parameter by Eq. 8.5 or 8.9,
considerable mathematical simplification can be achieved if the distribution
of the parameter is appropriately chosen with respect to that of the
underlying random variable. We saw this in Sect. 8.4.2 in the case of the Gaussian
random variable X with known σ; by assuming the prior distribution of μ
to be also Gaussian, the posterior distribution of μ remains Gaussian. This
was similarly demonstrated for the discrete case in Example 8.4, in which
the random variable has a binomial distribution and the prior distribution
for p was assumed to be a beta distribution (with parameters q' = 4 and
r' = 2). The resulting posterior distribution for p is also a beta distribution,
with updated parameters q'' = 8 and r'' = 3.

Such pairs of distributions are known in the Bayesian terminology as
conjugate pairs or conjugate distributions. By choosing a prior distribution
that is a conjugate of the distribution of the underlying random variable,
a convenient posterior distribution, which is usually of the same
mathematical form as the prior, is obtained. This has been illustrated earlier in
the case of the normal-normal and the binomial-beta distributions. Other
pairs of conjugate distributions may be developed; Table 8.1 summarizes
some of these involving certain common distributions.

It should be emphasized that conjugate distributions are chosen solely
for mathematical convenience and simplicity. For a random variable with a
specified distribution, its conjugate prior distribution may be adopted if
there is no other basis for the choice of the prior distribution. However, if
there is evidence to support a particular prior distribution, then such a
distribution ought to be used, mathematical complications notwithstanding.


EXAMPLE 8.10

The occurrence of flaws in a weld joint may be modeled by a Poisson process with
a mean occurrence rate of μ flaws per meter of weld. Actual observation with a
powerful device (assume it would not miss detecting any significant flaw) detected 5
flaws in a weld of 9.2 meters. However, from previous experience with the same type
of weld and quality of workmanship, the mean flaw rate is believed to be 0.5 flaw/m
with a COV of 40%. Determine the mean and COV of μ for this type of weld, using
the observed data as well as the information from prior experience.

Since the number of flaws in a given weld length is described by the Poisson
distribution, it is convenient, according to Table 8.1, to prescribe its conjugate
gamma distribution as the prior distribution for the parameter μ. From the
information given above, and observing from Section 3.2.8 the mean and variance of
the gamma distribution, we have

$$E'(\mu) = \frac{k'}{\nu'} = 0.5$$


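Table 8.1 Conjugate Distributions

Binomial
  Basic random variable: $p_X(x) = \binom{n}{x} \theta^x (1 - \theta)^{n-x}$
  Parameter: θ
  Prior and posterior distributions of parameter: Beta,
    $f_\Theta(\theta) = \dfrac{\Gamma(q + r)}{\Gamma(q)\,\Gamma(r)}\, \theta^{q-1} (1 - \theta)^{r-1}$
  Mean and variance of parameter: $E(\Theta) = \dfrac{q}{q + r}$, $\mathrm{Var}(\Theta) = \dfrac{qr}{(q + r)^2 (q + r + 1)}$
  Posterior statistics: $q'' = q' + x$, $r'' = r' + n - x$

Exponential
  Basic random variable: $f_X(x) = \lambda e^{-\lambda x}$
  Parameter: λ
  Prior and posterior distributions of parameter: Gamma,
    $f_\Lambda(\lambda) = \dfrac{\nu (\nu\lambda)^{k-1} e^{-\nu\lambda}}{\Gamma(k)}$
  Mean and variance of parameter: $E(\lambda) = \dfrac{k}{\nu}$, $\mathrm{Var}(\lambda) = \dfrac{k}{\nu^2}$
  Posterior statistics: $\nu'' = \nu' + \sum x_i$, $k'' = k' + n$

Normal (with known σ)
  Basic random variable: $f_X(x) = \dfrac{1}{\sqrt{2\pi}\,\sigma} \exp\left[-\dfrac{1}{2}\left(\dfrac{x - \mu}{\sigma}\right)^2\right]$
  Parameter: μ
  Prior and posterior distributions of parameter: Normal,
    $f_M(\mu) = \dfrac{1}{\sqrt{2\pi}\,\sigma_\mu} \exp\left[-\dfrac{1}{2}\left(\dfrac{\mu - \mu_\mu}{\sigma_\mu}\right)^2\right]$
  Mean and variance of parameter: $E(\mu) = \mu_\mu$, $\mathrm{Var}(\mu) = \sigma_\mu^2$
  Posterior statistics: $\mu_\mu'' = \dfrac{\mu_\mu' (\sigma^2/n) + \bar{x}\,(\sigma_\mu')^2}{\sigma^2/n + (\sigma_\mu')^2}$,
    $\sigma_\mu'' = \sqrt{\dfrac{(\sigma_\mu')^2 (\sigma^2/n)}{(\sigma_\mu')^2 + \sigma^2/n}}$

Normal
  Basic random variable: $f_X(x) = \dfrac{1}{\sqrt{2\pi}\,\sigma} \exp\left[-\dfrac{1}{2}\left(\dfrac{x - \mu}{\sigma}\right)^2\right]$
  Parameter: μ, σ
  Prior and posterior distributions of parameter: Gamma-Normal, $f(\mu, \sigma)$, with statistics n, $\bar{x}$, and s²
  Mean and variance of parameter: $E(\mu) = \bar{x}$, $\mathrm{Var}(\mu) = \dfrac{s^2 (n - 1)}{n (n - 3)}$,
    $E(\sigma) = s \sqrt{\dfrac{n - 1}{2}}\, \dfrac{\Gamma[(n - 2)/2]}{\Gamma[(n - 1)/2]}$, $\mathrm{Var}(\sigma) = \dfrac{s^2 (n - 1)}{n - 3} - E^2(\sigma)$
  Posterior statistics: $n'' = n' + n$, $n'' \bar{x}'' = n' \bar{x}' + n \bar{x}$,
    $(n'' - 1) s''^2 + n'' \bar{x}''^2 = [(n' - 1) s'^2 + n' \bar{x}'^2] + [(n - 1) s^2 + n \bar{x}^2]$

Poisson
  Basic random variable: $p_X(x) = \dfrac{(\mu t)^x e^{-\mu t}}{x!}$
  Parameter: μ
  Prior and posterior distributions of parameter: Gamma,
    $f_M(\mu) = \dfrac{\nu (\nu\mu)^{k-1} e^{-\nu\mu}}{\Gamma(k)}$
  Mean and variance of parameter: $E(\mu) = \dfrac{k}{\nu}$, $\mathrm{Var}(\mu) = \dfrac{k}{\nu^2}$
  Posterior statistics: $\nu'' = \nu' + t$, $k'' = k' + x$

Lognormal (with known ζ)
  Basic random variable: $f_X(x) = \dfrac{1}{\sqrt{2\pi}\,\zeta x} \exp\left[-\dfrac{1}{2}\left(\dfrac{\ln x - \lambda}{\zeta}\right)^2\right]$
  Parameter: λ
  Prior and posterior distributions of parameter: Normal,
    $f_\Lambda(\lambda) = \dfrac{1}{\sqrt{2\pi}\,\sigma} \exp\left[-\dfrac{1}{2}\left(\dfrac{\lambda - \mu}{\sigma}\right)^2\right]$
  Mean and variance of parameter: $E(\lambda) = \mu$, $\mathrm{Var}(\lambda) = \sigma^2$
  Posterior statistics: $\mu'' = \dfrac{\mu' (\zeta^2/n) + \sigma'^2\, \overline{\ln x}}{\zeta^2/n + \sigma'^2}$,
    $\sigma'' = \sqrt{\dfrac{\sigma'^2 (\zeta^2/n)}{\sigma'^2 + \zeta^2/n}}$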


and

$$\delta'(\mu) = \frac{\sqrt{k'}/\nu'}{k'/\nu'} = \frac{1}{\sqrt{k'}} = 0.4$$

Thus the prior parameters of the gamma distribution are k' = 6.25 and ν' = 12.5.

It follows then that the posterior distribution of μ is also gamma. From the
relationships given in Table 8.1 between the prior and posterior statistics, and the
sample data, we evaluate the parameters k'' and ν'' of the posterior gamma
distribution as follows:

$$k'' = k' + x = 6.25 + 5 = 11.25$$

$$\nu'' = \nu' + t = 12.5 + 9.2 = 21.7$$

Hence the updated mean and COV of the average flaw rate μ are

$$E''(\mu) = \frac{k''}{\nu''} = \frac{11.25}{21.7} = 0.52 \text{ flaw/m}$$

$$\delta''(\mu) = \frac{1}{\sqrt{k''}} = \frac{1}{\sqrt{11.25}} = 0.30$$

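The gamma-Poisson update of Example 8.10 reduces to two additions. The sketch below is illustrative; the helper names `gamma_from_mean_cov` and `update_gamma` are assumptions, and the gamma prior is parameterized by k and ν as in Table 8.1 (mean k/ν, COV 1/√k).

```python
from math import sqrt


def gamma_from_mean_cov(mean: float, cov: float) -> tuple[float, float]:
    """Gamma parameters (k, nu) matching a given mean k/nu and COV 1/sqrt(k)."""
    k = 1 / cov ** 2
    nu = k / mean
    return k, nu


def update_gamma(k: float, nu: float, count: int, length: float) -> tuple[float, float]:
    """Conjugate update for a Poisson rate: add the count to k and the exposure to nu."""
    return k + count, nu + length


k_prior, nu_prior = gamma_from_mean_cov(mean=0.5, cov=0.4)

# Observation: 5 flaws detected in 9.2 m of weld.
k_post, nu_post = update_gamma(k_prior, nu_prior, count=5, length=9.2)

post_mean = k_post / nu_post          # updated mean flaw rate, flaws per meter
post_cov = 1 / sqrt(k_post)           # updated coefficient of variation

print(round(k_prior, 2), round(nu_prior, 2))
print(round(post_mean, 2), round(post_cov, 2))
```

The updated mean stays close to the prior mean because the observed rate, 5/9.2 flaws per meter, is near 0.5; the main effect of the data is to reduce the COV.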

8.5. CONCLUDING REMARKS

In the process of engineering planning and design, judgmental assumptions
and inferential information are often useful and necessary. The
significance of such prior information and its role (in combination with
observational data) in the process of estimation are formally the subject
of Bayesian statistics. The basic concepts of the Bayesian approach have
been introduced here with special reference to sampling and estimation.
Applications of these concepts in Bayesian statistical decision will be
covered in Vol. II.

Philosophically, there are fundamental differences between the Bayesian
and classical statistics. Within the Bayesian context, a probability or a
probability statement is an expression of the degree-of-belief, whereas in
the classical sense, probability is a verifiable measure of relative frequency.
Furthermore, in estimation, the Bayesian approach assumes that a parameter
is a random variable, whereas in the classical approach it is an unknown
constant.

Relative to engineering planning and design, the Bayesian approach
offers the following advantages:

1. It provides the formal framework for incorporating engineering
judgment (expressed in probability terms) with observational data.

2. It systematically combines uncertainties associated with randomness
and those arising from errors of estimation and prediction (see
Vol. II).

3. It provides a formal procedure for systematic updating of information.



PROBLEMS 


8.1 A new structure is subjected to proof testing. Assume that the maximum proof
load is specified at a reasonably high level so that the calculated probability of
the structure surviving the maximum proof load is 0.90. However, it is felt
that this calculation is only 70% reliable, and there is a 25% chance that the
true probability may be 0.50; moreover, there is even a 5% chance that it may
be only 0.10.

(a) What is the expected probability of survival before the proof test?

(b) If only one structure is proof-tested, and it survives the maximum proof
load, determine the updated distribution of the survival probability.

(c) What is the expected probability of survival after the proof test?

(d) If three structures were proof-tested, and two of the structures survived
whereas one failed under the maximum proof load, determine the
updated expected probability of survival.


8.2 A new waste-treatment process has been developed. In order to evaluate its
effectiveness, the treatment process is installed for a trial period. Each day the
output from the treatment process is inspected to see if it satisfies the specified
standard. Suppose that the outputs between days are statistically independent,
and there is a probability p that the daily output will be acceptable. If the
prior PMF is as shown in Fig. P8.2, determine the posterior distribution of p
with each of the following observations.

(a) The output on the first day of the trial period is of unacceptable quality.
(b) For a three-day trial period, the quality is unacceptable in only one day.
(c) For a three-day trial period, the first two days are satisfactory whereas
the quality is unacceptable on the third day.
In each case, determine also the Bayesian estimate for p. Ans. 0.536, 0.617,
0.617.

Figure P8.2 Prior PMF P'(p_i) versus p (axis ticks at p = 0.4, 0.6, 0.8, 1.0)


8.3 A hazardous street intersection has been improved by changing the geometric
design to reduce the accident and fatality rates. For simplicity, assume that
accident and fatality rates can be classified as high H or low L, leading to the
following possible conditions: H_A H_F (high accident rate, high fatality rate),
H_A L_F, L_A H_F, and L_A L_F. Preliminary evaluation revealed that the relative
likelihoods for these four conditions are 3:3:2:2.

An accident rate prediction model, for example, the Tharp model, was used
to obtain a better evaluation of the accident potential at this (improved)
intersection. Because of possible inaccuracies in the prediction model, a
predicted condition may not be actually realized. Furthermore, the probability
of a correct prediction depends on the underlying actual condition, as
indicated in the following table of conditional probabilities.

                          Actual
Predicted       H_A H_F   H_A L_F   L_A H_F   L_A L_F

H'_A H'_F        0.30      0.40      0.20      0.25
H'_A L'_F        0.30      0.30      0.20      0.25
L'_A H'_F        0.20      0.20      0.50      0.25
L'_A L'_F        0.20      0.10      0.10      0.25

(a) What is the probability that the model will indicate H'_A H'_F?

(b) Suppose that the model predicted H} y "p; what is the probability that 
the condition of the improved intersection will actually be I 4/5? 

(c) If the model predicted L'_A L'_F, what are the updated relative likelihoods
of the four possible conditions?


8.4 An instrument is used to check the accuracy of a set of measurements. However,
it can only record three readings, namely x = 1, 2, or 3. The reading
x = 2 implies that the previously measured value is within a tolerable error,
whereas x = 1 and x = 3 denote that the measurement is on the low and high
side, respectively. Suppose the distribution of X is given as follows:

$$p_X(x) = \begin{cases} \dfrac{1 - m}{2} & x = 1 \\ m & x = 2 \\ \dfrac{1 - m}{2} & x = 3 \end{cases}$$

where m is the parameter. For a particular set of measurements, the engineer
estimated that the value of m would be 0.4 or 0.8 with equal likelihood.
However, on checking a set of measurements, the first one indicates x = 2.
(a) What should be the engineer's revised distribution of m?
(b) Estimate the probability that at least two out of the next three
measurements will be accurate.
8.5 An engineer plans to build a log cabin in the middle of a forest where logs of
similar size are available. He assumes that the bending capacity M of each log
follows a Rayleigh distribution

$$f_M(m) = \frac{m}{\lambda^2} \exp\left[-\frac{1}{2}\left(\frac{m}{\lambda}\right)^2\right], \qquad m \ge 0$$

where the parameter λ is the modal value of the distribution.

From previous experience with similar logs, he feels that λ would be 4 kip-ft
with probability 0.4 or 5 kip-ft with probability 0.6. Not entirely satisfied
with these subjective probabilities, he decided to get a better measure of the
parameter λ. Being pressed for time and with limited supply of logs, he can
only afford to test the bending capacity of two logs by simple load test on the
site. The test results yielded 4.5 kip-ft and 5.2 kip-ft for the two tests.

(a) Determine the posterior distribution (discrete) for the parameter λ.

(b) Derive the distribution of the bending capacity of the logs M, based on
the posterior distribution of λ.

(c) What is the probability that M is less than 2 kip-ft?


8.6 The absolute error E (in cm) of each measurement from a surveying instrument
is governed by the triangular distribution shown in Fig. P8.6, where α
denotes the upper limit of the error.

Figure P8.6 Triangular density $f_E(e) = \dfrac{2}{\alpha}\left(1 - \dfrac{e}{\alpha}\right)$, $0 \le e \le \alpha$ (e in cm)

Two measurements were made and the errors are 1 and 2 cm, respectively.
(a) Suppose that α is assumed to be 2 or 3 cm with equal likelihood prior to
the two measurements; determine the updated distribution of α.
Estimate the value of α based on this updated distribution.
(b) Now suppose that the prior density function of α is uniform between 2
and 3; determine and plot the updated distribution of α, and evaluate
the corresponding Bayesian estimate for α.

8.7 Suppose that the prior density function of the mean accident rate $\nu$ in
Example 8.2 is


o= oz. 0.5 <v «20 


= "d elsewhere 


Determine the posterior density function of $\nu$ based on the observation that an
accident was recorded during the first month of operation.
8.8 In Problem 8.1 suppose that the survival probability has a prior density
function as follows.
(i) Uniform between p = 0 to 0.9.
(ii) Uniform between p = 0.9 to 1.0.
(iii) It is more likely that p will exceed 0.9 than be less than 0.9; the relative
likelihood between these two possibilities is 7 to 3.

(a) Determine the prior density function of p.

(b) If three structures were proof tested, and all three survived the maximum
proof load, determine and plot the posterior density function of p.

(c) What is the estimated value of p in light of the results in part (b)? 


8.9 The occurrence of fires in a city may be modeled by a Poisson process. Suppose


9. Elements of Quality Assurance and Acceptance Sampling

The assurance of product quality in manufacturing has long been a problem
of industrial and production engineers. Quality assurance, however, is of
concern to all engineers. Compliance with minimum standards of construction
and fabrication, and of quality in materials and workmanship, is
necessary to ensure the design performance capability of an engineering
system. For these purposes, acceptance criteria and acceptance sampling
programs are necessary. For example, in the construction of highway
pavements, acceptance criteria are necessary to ensure compliance with
construction specifications; similarly, in stream pollution control, inspection
plans are necessary to enforce water quality standards.

Probability concepts and statistical techniques are pertinent and useful 
to a variety of quality assurance problems. In this chapter we present and 
develop those statistical concepts that form the bases of some commonly 
used acceptance sampling programs. Such programs are of two types— 
sampling by attributes and sampling by variables. 


9.1. ACCEPTANCE SAMPLING BY ATTRIBUTES 


When a lot of material of size N is submitted for inspection, a sample of n
items may be selected at random from the lot and subjected to inspection
and testing. In acceptance sampling by attributes, each of these n items is
classified as good (acceptable) or bad (defective) after the test. It is a
common criterion that if more than r defective items are found from the
sample of n, the lot will be rejected. Conversely, the lot will be accepted if
there are r or fewer defectives. If among the lot of size N, the actual fraction
of defectives is p, then the total number of defective items in the sample of
size n is described by the hypergeometric distribution (see Section 3.2.9),


and the probability of accepting the lot is accordingly given by

$$g(p) = \sum_{x=0}^{r} \frac{\binom{Np}{x}\binom{Nq}{n-x}}{\binom{N}{n}} \tag{9.1}$$


where q = 1 - p. If n is small relative to N, it can be shown (Hald, 1952)
that g(p) in Eq. 9.1 can be approximated by

$$g(p) \simeq \sum_{x=0}^{r} \binom{n}{x} p^x q^{n-x} \tag{9.2}$$

which involves the binomial distribution. Equation 9.2 can also be written

$$g(p) = 1 - \sum_{x=r+1}^{n} \binom{n}{x} p^x q^{n-x} \tag{9.2a}$$

In this latter form, the term $\sum_{x=r+1}^{n} \binom{n}{x} p^x q^{n-x}$ can be found in tables of
binomial probabilities, for example, Eisenhart (1950) or Aiken (1955).
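Where tables are not at hand, Eqs. 9.1 and 9.2 can be evaluated directly. The following is a minimal sketch, not part of the original text; the function names and the illustrative lot (N = 1000 items with 40 defectives) are assumptions chosen only to show how close the two expressions are when n is small relative to N.

```python
from math import comb


def accept_prob_hypergeom(N, n, r, defectives):
    """Eq. 9.1: probability of finding at most r defectives in a sample of n
    drawn from a lot of N items that contains `defectives` bad items."""
    total = comb(N, n)
    return sum(
        comb(defectives, x) * comb(N - defectives, n - x) for x in range(r + 1)
    ) / total


def accept_prob_binom(n, r, p):
    """Eq. 9.2: binomial approximation to the acceptance probability g(p)."""
    q = 1.0 - p
    return sum(comb(n, x) * p**x * q ** (n - x) for x in range(r + 1))


# Hypothetical lot: N = 1000 items, 40 defective, so p = 0.04.
exact = accept_prob_hypergeom(N=1000, n=15, r=1, defectives=40)
approx = accept_prob_binom(n=15, r=1, p=0.04)
print(f"hypergeometric: {exact:.4f}  binomial: {approx:.4f}")
```

Evaluating `accept_prob_binom` over a range of p for fixed n and r traces out the acceptance probability as a function of the fraction defective.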


9.1.1. The operating characteristic (OC) curve

The function of Eq. 9.2 is referred to as the OC curve (operating
characteristic curve). Examples of the OC curve are shown in Fig. 9.1 for
various sampling plans (with different combinations of n and r). It can be
observed from each of the OC curves in Fig. 9.1 that as the fraction of
defective items increases, the probability that the lot will be accepted
decreases. For example, according to the sampling plan of Fig. 9.1b, with
n = 15 and r = 1, there is less than 5% probability that a lot with 24%
defective items will be accepted, whereas the probability of acceptance
increases to 88% for a lot with 4% defective. From the appropriate OC
curve, therefore, we can read off the probabilities of accepting and rejecting
lots containing various percentages of defective items.


Generally, in determining the optimal inspection plan, it should be kept
in mind that the plan has to be accepted by both the supplier and receiver
of the lot.

• For the supplier, it is desirable that the plan have a low probability
of rejecting a lot in which the actual fraction of defective items p is
less than $p_1$, the maximum fraction of defective units permitted in
good-quality lots.

• For the receiver, it is desirable that there be a low probability of
accepting a lot if p exceeds $p_2$, the minimum fraction of defective
units sufficient to define poor-quality lots.


[Figure 9.1a and b. Operating characteristic (OC) curves, g(p) versus p. (a) n = 10. (b) n = 15.]

[Figure 9.1c and d. Operating characteristic (OC) curves, g(p) versus p. (c) n = 20. (d) n = 25.]

[Figure 9.1e. Operating characteristic (OC) curve, g(p) versus p. (e) n = 30.]


The supplier's risk of rejecting good-quality lots and the receiver's risk of
accepting inferior-quality lots may be referred to as the producer's risk
and consumer's risk, respectively. The optimal inspection plan should
have values of n and r such that the corresponding OC curve will
satisfy these risk levels.


EXAMPLE 9.1

In the construction of an earth embankment, the fill material is compacted to a
specified CBR (California Bearing Ratio). Suppose that unsatisfactory performance
will result if more than 15% of the fill falls below the specified CBR limit; however,
the compaction that can be expected at the contract price would contain
1% of the embankment falling below the specified CBR limit. Assume a 5% risk
for both the producer and consumer; what values of n and r should be used in the
quality control program?
Using the approximation of Eq. 9.2, the conditions to be satisfied are

$$g(0.15) = \sum_{x=0}^{r} \binom{n}{x} (0.15)^x (0.85)^{n-x} = 0.05$$

and

$$g(0.01) = \sum_{x=0}^{r} \binom{n}{x} (0.01)^x (0.99)^{n-x} = 0.95$$

The required values of n and r may be determined by trial and error from the
two simultaneous equations given above. Alternatively, we may use
the available OC curves (such as Fig. 9.1) and select the appropriate one. It may be
observed from Fig. 9.1e that n = 30 and r = 1 will suffice.
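The trial-and-error search in Example 9.1 can also be carried out numerically. The sketch below is an illustration rather than the authors' procedure: it treats the two conditions as inequalities (acceptance probability of at least 0.95 for the good lot and at most 0.05 for the bad lot), since integer n and r will seldom satisfy them with equality. The function names and the search limit `n_max` are assumptions.

```python
from math import comb


def g(p, n, r):
    """Binomial acceptance probability of Eq. 9.2."""
    return sum(comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(r + 1))


def smallest_plan(p_good, p_bad, producer_risk, consumer_risk, n_max=200):
    """Scan n upward (and r upward within each n) and return the first (n, r)
    that accepts a good lot with probability >= 1 - producer_risk and accepts
    a bad lot with probability <= consumer_risk. Returns None if none found."""
    for n in range(1, n_max + 1):
        for r in range(n + 1):
            good_ok = g(p_good, n, r) >= 1 - producer_risk
            bad_ok = g(p_bad, n, r) <= consumer_risk
            if good_ok and bad_ok:
                return n, r
    return None


plan = smallest_plan(p_good=0.01, p_bad=0.15, producer_risk=0.05, consumer_risk=0.05)
print(plan)
```

For the data of Example 9.1 the search stops at the same plan read from Fig. 9.1e, n = 30 and r = 1.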




EXAMPLE 9.2

The density of asphalt concrete in a roadway is to be inspected. Twenty specimens
(15-in. square each) were cored at random locations over a 5-mile stretch of the
roadway. Laboratory tests show that only one of the specimens has a density below
the specified limit. Suppose that the maximum permissible fraction of defective
asphalt concrete is 15% and it is desired to limit the risk of accepting inferior-quality
material to 4%. Should the asphalt concrete be rejected?

The acceptance function g(p) in Eq. 9.2 can be applied. In this case the criterion
for accepting the asphalt concrete is

$$g(0.15) \le 0.04$$

For n = 20 and r = 1, Eq. 9.2 gives

$$g(0.15) \simeq \binom{20}{0}(0.15)^0(0.85)^{20} + \binom{20}{1}(0.15)^1(0.85)^{19} = 0.0388 + 0.1368 = 0.1756$$

Since g(0.15) exceeds 0.04, the asphalt concrete should be rejected. In this case, for
the roadway to be acceptable, all the 20 cored specimens must have at least the
minimum specified density; because with r = 0,

$$g(0.15) \simeq \binom{20}{0}(0.15)^0(0.85)^{20} = 0.0388 < 0.04$$


EXAMPLE 9.3

Part of the probabilistic stream standard proposed by Loucks and Lynn (1966)
requires that the probability of dissolved oxygen (DO) in a stream falling below
4 mg/l in any one day should be less than 0.2. Suppose that the DO concentrations in
the stream between days are independent and identically distributed; how many days
of measurements are required to achieve a 95% confidence that this standard is met,
that is, P(DO < 4 mg/l) < 0.2, for each of the following cases?

(i) The daily DO cannot be less than 4 mg/l during the period of measurement.

(ii) The daily DO is allowed to be less than 4 mg/l at most once during the
period of measurement.

Let p denote the probability that the daily DO is less than 4 mg/l. The problem
here is to determine n so that the probability of rejecting a stream quality with p
exceeding 0.2 is 0.95, or

$$g(0.2) = 1 - 0.95 = 0.05$$

Since the lot size here is conceivably unlimited, we may apply Eq. 9.2. For case (i),
r = 0; hence

$$g(0.2) = (0.8)^n = 0.05$$

obtaining

$$n = 13.4 \simeq 14 \text{ days}$$




For case (ii), r = 1; hence

$$g(0.2) = (0.8)^n + n(0.2)(0.8)^{n-1} = 0.05$$

By trial and error, we obtain the required period of measurement to be 22 days.

It may be observed, from the above results, that a two-stage sequential sampling
plan may be devised to achieve the same degree of confidence in meeting the
stream quality standard, as follows.

Step 1

Take 14 days of measurements; if the daily DO concentration during all
14 days is above 4 mg/l, the standard is met; if the DO concentration falls below
4 mg/l for two or more days, the standard is not met, whereas if the DO concentration
is below 4 mg/l on only one day, go to the second step.

Step 2

Continue taking measurements for another eight days; if the DO concentration
is above 4 mg/l in each of the eight days, the standard is met; otherwise, the standard
is not met.
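The two measurement periods in Example 9.3 can be checked with a short search over n. This is a minimal sketch under the same assumptions as the example (independent days, Eq. 9.2 with p = 0.2); the function names and the upper search limit are illustrative choices, not from the original.

```python
from math import comb


def g(p, n, r):
    """Binomial acceptance probability of Eq. 9.2."""
    return sum(comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(r + 1))


def days_required(p, r, confidence, n_max=1000):
    """Smallest number of days n for which a stream with exceedance
    probability p would be accepted with probability <= 1 - confidence,
    when at most r low-DO days are tolerated."""
    target = 1 - confidence
    for n in range(r + 1, n_max + 1):
        if g(p, n, r) <= target:
            return n
    return None


n_case_i = days_required(p=0.2, r=0, confidence=0.95)
n_case_ii = days_required(p=0.2, r=1, confidence=0.95)
print(n_case_i, n_case_ii)
```

The difference between the two results is the length of the second stage in the two-stage plan described above.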


9.1.2. The success run 


A type of quality control problem commonly encountered is as follows.

A "batch" of material or manufactured products is submitted for inspection.
Suppose that a sample of n specimens is picked at random
from the batch and each one is tested for compliance with a standard
specification. If acceptance of the batch requires no failure out of the n
specimens tested, what should the value of n be in order to ensure a
reliability R for the manufactured products with a confidence level C?

This problem can be solved by applying Eq. 9.2 with r equal to zero. A
reliability of R implies that the fraction defective is p = 1 - R;
hence the probability of accepting a batch, whose reliability is R, is given
by

$$g(1 - R) = R^n \tag{9.3}$$

Since a confidence level C is desired, the probability of acceptance should
be

$$R^n = 1 - C \tag{9.4}$$

from which we obtain

$$n = \frac{\ln(1 - C)}{\ln R} \tag{9.5}$$

It may be emphasized that this is the minimum number of specimens that
must be tested without any failures (that is, a success run of n), in order to
assure that the reliability of the product is R with confidence C.




EXAMPLE 9.4 


In a prestressed concrete reactor containment structure, 900 tendons are used in 
prestressing the concrete wall. After operating for a period of time, the level of 
prestressing in these tendons may decrease. Suppose that to assure proper performance
of the structure, no more than 4% of the prestressing tendons can be allowed
to have less than the specified minimum prestressing force. Then, if a 95% confidence
is desired, how many tendons must be tested without failures?

In this case we have R = 0.96, and C = 0.95; thus Eq. 9.5 yields

$$n = \frac{\ln(1 - 0.95)}{\ln 0.96} \simeq 74$$

Hence, in order to assure proper structural performance with a 95% confidence,
74 tendons must be tested and all must have prestressing forces equal to or above
the specified minimum.
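The success-run length of Eq. 9.5 is a one-line calculation. The sketch below is illustrative only; the function name is an assumption, and the result is rounded up because n must be a whole number of specimens.

```python
from math import ceil, log


def success_run_length(reliability, confidence):
    """Eq. 9.5, rounded up: number of consecutive successful tests needed to
    claim the given reliability R with confidence C."""
    return ceil(log(1 - confidence) / log(reliability))


# Data of Example 9.4: R = 0.96, C = 0.95.
n_tendons = success_run_length(reliability=0.96, confidence=0.95)
print(n_tendons)
```

Because ln R approaches zero as R approaches 1, the required run length grows quickly for high reliability targets.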


9.1.3. The average outgoing quality curve 

Another commonly used quality control sampling plan involves the AOQ
(average outgoing quality) curve; this is a plot of the expected fraction of
defective units in the accepted product (after inspection) as a function of
the assumed fraction of defective units in the as-submitted lot. In other
words, the AOQ curve indicates the degree of protection offered by the
inspection program by providing information on the average quality of
the accepted product.

Consider an inspection plan where each defective unit is replaced by an
acceptable one. If the lots submitted for inspection consist of (100p)%
defective units, the average fraction of these lots that will be accepted is
given by g(p) of Eq. 9.2, whereas the average fraction that will be rejected
is [1 - g(p)]. Moreover, among the accepted lots, the fraction of defective
units is p, whereas the lots that were rejected will presumably be screened
and returned as perfect products and thus will contain no defective units.
Hence the average fraction of defective units in the final product is given by

$$\text{AOQ} = p\,g(p) + 0\,[1 - g(p)] = p\,g(p) \tag{9.6}$$

The AOQ curve corresponding to the OC curve of Fig. 9.1c (with n = 20,
r = 1) is plotted in Fig. 9.2. It can be observed that when p is small, the
AOQ is also low as expected. However, the AOQ is also low for high values
of p; this is because, in this case, the lots will have a high likelihood of being
rejected and the defective units replaced with good ones before resubmission.
Therefore, at some moderate value of p (8% in the case of Fig.
9.2), the AOQ will attain its maximum value; this value of AOQ is referred
to as the AOQL (average outgoing quality limit), denoting the maximum
possible fraction of defective units in the resultant product associated with
this quality control plan. In other words, using this acceptance inspection
procedure, we can ensure that the fraction of defectives in the overall
quality of the product will be less than the AOQL.

[Figure 9.2. Average outgoing quality (AOQ) curve, AOQ versus p.]
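Equation 9.6 lends itself to a direct numerical evaluation of the AOQ curve and its maximum. The following sketch is an illustration, not the authors' method: it locates the AOQL by a simple grid search over p, and the function names and grid resolution are assumptions.

```python
from math import comb


def g(p, n, r):
    """Binomial acceptance probability of Eq. 9.2."""
    return sum(comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(r + 1))


def aoq(p, n, r):
    """Eq. 9.6: average outgoing quality for incoming fraction defective p."""
    return p * g(p, n, r)


def aoql(n, r, steps=10000):
    """Grid search over p in [0, 1] for the maximum of the AOQ curve.
    Returns (p at the maximum, AOQL)."""
    grid = (i / steps for i in range(steps + 1))
    p_star = max(grid, key=lambda p: aoq(p, n, r))
    return p_star, aoq(p_star, n, r)


# Plan of Fig. 9.1c: n = 20, r = 1.
p_star, limit = aoql(n=20, r=1)
print(f"AOQ peaks near p = {p_star:.3f} with AOQL = {limit:.4f}")
```

Calling `aoql` with other values of n and r gives the corresponding limit for any single-stage attribute plan of this type.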


EXAMPLE 9.5 


In Example 9.1 where the quality of compacted fill is being inspected, suppose
that the sampling plan on a section of the embankment requires CBR tests at 30 random
locations, and the section will be accepted if no more than one of these 30 tests
shows substandard compaction. On the other hand, if there is more than one test
showing substandard compaction, the entire section will be recompacted (assume
that recompacting will correct any substandard quality in the original fill). What is
the resulting average quality of compacted fill as a function of the percentage of
substandard compaction in the original fill?

Let p denote the percentage of substandard compacted fill. Apply Eq. 9.6 to the
OC curve of Fig. 9.1e for n = 30 and r = 1. The resulting AOQ curve is plotted in
Fig. E9.5. This curve gives the resulting average percentage of substandard compaction
as a function of the percentage of the substandard portion of the original fill. According
to this sampling plan, the worst quality of compaction would have an average of
2.7% of substandard compaction in the final embankment. This will occur if the
original embankment contains approximately 5% substandard compaction.

[Figure E9.5. AOQ curve for compacted fill of Example 9.5, AOQ versus p.]


9.2. ACCEPTANCE SAMPLING BY VARIABLES


In sampling by attributes, as presented in Section 9.1, the quality of each
item is classified simply as good or bad. With this procedure, one item could
have better quality than another even though both of them are "good."
It would seem that the actual measurements of the items tested should
have some bearing on the quality of a lot. For this reason, a sampling plan
that is based simply on a good-or-bad classification for each item tested
would not fully utilize the information from the test results. An alternative
procedure is sampling by variables, in which the values of the variable (a
quality indicator) for each specimen in a lot are measured. By comparing
certain statistic(s) of the observed data, such as the sample mean, with
some standard value, the quality of the lot may be determined. Since the
measured data are more fully utilized here than in sampling by attributes,
the sample size required to achieve the same degree of quality control can
be significantly smaller than that of sampling by attributes.
In sampling by variables, the distribution of the sample statistic is
required, and this depends on the distribution of the underlying random
variable. In practice, the Gaussian distribution is usually assumed. In
many cases, such as the degree of compaction and material density, the
variable appears to be Gaussian; in other cases, when the sample size is
large, the distribution of the sample statistic (such as the mean value) is
Gaussian by virtue of the central limit theorem. The following are some of
the criteria used in acceptance sampling by variables.


9.2.1. Average quality criterion, $\sigma$ known
In many quality control programs, such as inspection of compaction level,
moisture content, and bulk density of construction materials, the assurance
of quality may be based on the average quality of the lot. Suppose
that a mean value $\mu_a$ for a lot constitutes good or acceptable quality (from
the producer's standpoint), whereas a mean value of $\mu_t$ for the lot corresponds
to bad or unacceptable quality (from the consumer's position). A
common acceptance plan would be to minimize the probability that a good
lot (that is, with $\mu_a$) will be rejected, and also minimize the probability that
a bad lot (with $\mu_t$) will be accepted. Let $\alpha$ and $\beta$ denote the producer's and
consumer's risks, respectively; then the required sample size n and the
standard mean value L can be determined to satisfy these risks.

[Figure 9.3. Distribution of sample mean and corresponding risk measures: densities of the sample mean for the poor-quality lot and the good-quality lot, with the reject and accept regions separated at L.]

Consider first the case in which the standard deviation $\sigma$ of the variable
is known in advance (perhaps from experience) and the variable is
Gaussian. For a sample of size n, the sample mean will be Gaussian with
standard deviation $\sigma/\sqrt{n}$. Then if a lot is of acceptable quality (its mean
value is $\mu_a$), the sample mean will have a distribution $N(\mu_a, \sigma/\sqrt{n})$. Now
suppose that the lot will be rejected if the sample mean $\bar{x}$ is less than L;
then to limit the producer's risk to $\alpha$, we should have

$$P(\bar{X} < L \mid \mu_a) = \Phi\left(\frac{L - \mu_a}{\sigma/\sqrt{n}}\right) = \alpha \tag{9.7}$$

Similarly, if the lot is of poor quality (that is, with mean value $\mu_t$), the
sample mean will be $N(\mu_t, \sigma/\sqrt{n})$; then to satisfy the consumer's risk, we
should require

$$P(\bar{X} > L \mid \mu_t) = 1 - \Phi\left(\frac{L - \mu_t}{\sigma/\sqrt{n}}\right) = \beta \tag{9.8}$$

Equations 9.7 and 9.8 are portrayed graphically in Fig. 9.3; solution of
these two equations then yields L and n for given values of $\mu_a$, $\mu_t$, $\alpha$, and $\beta$.
EXAMPLE 9.6

The density of a newly completed airport pavement is to be inspected by coring
block specimens from the pavement. Suppose that from past experience, the
density of such pavements has a standard deviation of 3%. The mean densities
of good and poor quality lots are 96% and 92%, respectively. Determine the sampling
plan if both consumer's and producer's risks are taken at 5%.
With the following substitutions:

$$\mu_a = 0.96, \quad \mu_t = 0.92, \quad \alpha = 0.05, \quad \beta = 0.05, \quad \sigma = 0.03$$

Eqs. 9.7 and 9.8 become, respectively,

$$\Phi\left(\frac{L - 0.96}{0.03/\sqrt{n}}\right) = 0.05$$

and

$$1 - \Phi\left(\frac{L - 0.92}{0.03/\sqrt{n}}\right) = 0.05$$

or

$$L - 0.96 = \frac{0.03}{\sqrt{n}}\,\Phi^{-1}(0.05) = \frac{-0.049}{\sqrt{n}}$$

$$L - 0.92 = \frac{0.03}{\sqrt{n}}\,\Phi^{-1}(0.95) = \frac{0.049}{\sqrt{n}}$$

from which we obtain L = 0.94 and $n \simeq 6$. Therefore the sampling plan here
requires that six core specimens should be tested and the average density of the
six specimens should be at least 94%, to ensure the density of the pavement.
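Equations 9.7 and 9.8 can be solved in closed form by subtracting one from the other, which eliminates L. The sketch below is an illustration under the assumptions of this section (Gaussian variable, known standard deviation, lot rejected when the sample mean falls below L); the function name is hypothetical, and the sample size is returned unrounded so that the rounding decision stays with the reader.

```python
from statistics import NormalDist


def mean_plan_known_sigma(mu_a, mu_t, sigma, alpha, beta):
    """Solve Eqs. 9.7 and 9.8 for (n, L) when the lot is rejected if the
    sample mean is below L (so mu_a > mu_t). n is returned as a real number."""
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(alpha)      # Phi^{-1}(alpha), negative
    z_beta = std_normal.inv_cdf(1 - beta)    # Phi^{-1}(1 - beta), positive
    # Eq. 9.8 minus Eq. 9.7: mu_a - mu_t = (sigma / sqrt(n)) (z_beta - z_alpha)
    root_n = sigma * (z_beta - z_alpha) / (mu_a - mu_t)
    L = mu_a + z_alpha * sigma / root_n      # back-substitute into Eq. 9.7
    return root_n**2, L


# Data of Example 9.6.
n_real, L = mean_plan_known_sigma(mu_a=0.96, mu_t=0.92, sigma=0.03,
                                  alpha=0.05, beta=0.05)
print(f"n = {n_real:.2f}, L = {L:.3f}")
```

When the computed n is not an integer, rounding down leaves both risks slightly above their nominal values, whereas rounding up keeps both at or below them.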


EXAMPLE 9.7 (Excerpted from Grandage, 1966)

Cement is supplied in bags to a contractor. He is willing to accept cement with
0.3% moisture (considered to be excessively moist for construction purposes) only
10% of the time. However, the cement supplier demands that cement with 0.1%
moisture (considered to be excellent quality cement) should have a low probability
of rejection, no more than, for example, 5%.

Assuming that these moisture contents are average values, we have

$$\mu_a = 0.1\%; \quad \mu_t = 0.3\%$$
$$\alpha = 0.05; \quad \beta = 0.1$$

Assume further that $\sigma$ = 0.1%.
In this case, a good cement would be rejected if its mean moisture from a sample
exceeds the specified standard L; therefore we require

$$P(\bar{X} > L \mid \mu_a) = 1 - \Phi\left(\frac{L - 0.001}{0.001/\sqrt{n}}\right) = 0.05$$

whereas a poor cement would be accepted if its sample mean moisture is less than
L; hence we also require

$$P(\bar{X} < L \mid \mu_t) = \Phi\left(\frac{L - 0.003}{0.001/\sqrt{n}}\right) = 0.10$$

After simplification, we obtain

$$\frac{\sqrt{n}\,(L - 0.001)}{0.001} = \Phi^{-1}(0.95) = 1.64$$

and

$$\frac{\sqrt{n}\,(L - 0.003)}{0.001} = \Phi^{-1}(0.10) = -1.28$$

The nearest integer value of n that satisfies these equations is 3, and the corresponding
value of L is 0.21%. Therefore, if the mean moisture of three bags of cement
exceeds 0.21%, the lot should be rejected; otherwise, the lot should be accepted.


9.2.2. Average quality criterion, $\sigma$ unknown


Quite often $\sigma$ is not known in advance but has to be estimated from the
measured data; in such cases the sample mean will no longer be Gaussian.
Instead, the Student's t-distribution will apply (see Section 5.2.2). Then,
in place of Eqs. 9.7 and 9.8, the two simultaneous equations needed to
determine the required sample size and specified standard L are

$$\frac{L - \mu_a}{s/\sqrt{n}} = -t_{\alpha,n-1} \tag{9.9}$$

and

$$\frac{L - \mu_t}{s/\sqrt{n}} = t_{\beta,n-1} \tag{9.10}$$

where $t_{\alpha,n-1}$ and $t_{\beta,n-1}$ are the 100(1 - $\alpha$) and 100(1 - $\beta$) percentile values
of the t-distribution with (n - 1) degrees of freedom, and s is the sample
standard deviation. Here $t_{\alpha,n-1}$ and $t_{\beta,n-1}$ depend on n; hence the solutions
for L and n from Eqs. 9.9 and 9.10 would require a trial-and-error procedure.


EXAMPLE 9.8


In Example 9.7, if the standard deviation of the moisture content $\sigma$ is unknown,
the sample standard deviation s must be used. Suppose that the sample data yielded
s = 0.1%; then the constraint equations become

$$\frac{\sqrt{n}\,(L - 0.001)}{0.001} = t_{0.05,n-1}$$

$$\frac{\sqrt{n}\,(L - 0.003)}{0.001} = -t_{0.1,n-1}$$

After a trial-and-error procedure, the required sample size n is found to be 4 and the
corresponding value of L becomes 0.22%.


9.2.3. Fraction defective criterion


Sometimes the consumer is not so much concerned with the mean quality
and variability of a lot, but rather with the fraction of the lot that is
defective.

In a sampling plan to control the fraction defective in a lot, a sample of
size n is selected at random and measured. The fraction defective p may
then be estimated from the sample mean and standard deviation. The
criterion is that if the estimated fraction is greater than a maximum
allowable fraction M, the lot will be rejected.

Consider the case in which the required minimum value of a variable X is
$x_0$; then the fraction defective is simply the probability $P(X < x_0)$. Suppose
that the distribution of X is normal with standard deviation $\sigma$; then p may
be estimated as

$$\hat{p} = P(X < x_0) = \Phi\left(\frac{x_0 - \bar{x}}{\sigma}\right) \tag{9.11}$$

where $\bar{x}$ is the sample mean.

Again, we define $p_a$ and $p_t$, respectively, as the acceptable and unacceptable
fractions of defectives in a lot. If the fraction defective in a lot is
$p_a$, the probability that an item will be defective is

$$P(X < x_0) = p_a$$

or

$$\Phi\left(\frac{x_0 - \mu_a}{\sigma}\right) = p_a$$

That is, the mean value of X for a lot with $p_a$ fraction defective is given by

$$\mu_a = x_0 - \sigma\,\Phi^{-1}(p_a)$$

The event that this acceptable lot will be rejected is the same as $\hat{p}$ exceeding
M. From Eq. 9.11, this probability is

$$P(\hat{p} > M) = P\left[\Phi\left(\frac{x_0 - \bar{X}}{\sigma}\right) > M\right] = P\left[\bar{X} < x_0 - \sigma\,\Phi^{-1}(M)\right]$$

The sample mean $\bar{X}$ is Gaussian $N(\mu_a, \sigma/\sqrt{n})$ or $N[x_0 - \sigma\Phi^{-1}(p_a), \sigma/\sqrt{n}]$;
hence, limiting the risk of rejecting a good lot to $\alpha$, we have

$$P(\hat{p} > M) = \Phi\left[\frac{x_0 - \sigma\Phi^{-1}(M) - x_0 + \sigma\Phi^{-1}(p_a)}{\sigma/\sqrt{n}}\right] = \Phi\left\{\sqrt{n}\left[\Phi^{-1}(p_a) - \Phi^{-1}(M)\right]\right\} = \alpha$$

from which

$$\sqrt{n}\left[\Phi^{-1}(p_a) - \Phi^{-1}(M)\right] = \Phi^{-1}(\alpha) \tag{9.12}$$

Similarly, if the fraction defective in a lot is $p_t$, limiting the risk of
accepting a bad lot to $\beta$, we obtain

$$\sqrt{n}\left[\Phi^{-1}(p_t) - \Phi^{-1}(M)\right] = \Phi^{-1}(1 - \beta) \tag{9.13}$$

Solution of Eqs. 9.12 and 9.13 yields

$$n = \left[\frac{\Phi^{-1}(1 - \beta) - \Phi^{-1}(\alpha)}{\Phi^{-1}(p_t) - \Phi^{-1}(p_a)}\right]^2 \tag{9.14}$$

and

$$M = \Phi\left[\Phi^{-1}(p_a) - \frac{\Phi^{-1}(\alpha)}{\sqrt{n}}\right] \tag{9.15}$$

Therefore, in a sampling plan to control the fraction defective, the sample
size n and tolerable fraction defective M are given by Eqs. 9.14 and 9.15.
If $\sigma$ is unknown, acceptance sampling plans would be more difficult to
derive. In such cases, the required sampling plans may be developed using
standard tables, such as those provided in MIL-STD-414 (1957).


EXAMPLE 9.9


One of the principal items under control for a large earth dam embankment
project is D, the ratio of fill dry density to laboratory standard maximum dry
density. Suppose material of good-quality compaction requires that D exceeds
0.99; otherwise, the fill will be rated as poorly compacted. The engineer estimates
that for satisfactory performance of the embankment, the maximum tolerable
fraction of poorly compacted fill should not exceed 8%, whereas a 3% poorly
compacted fill is acceptable. Assume that the producer's and consumer's risks are
both 5%. Devise a sampling plan assuming that D has a Gaussian distribution.

With

$$\alpha = 0.05, \quad \beta = 0.05$$

and

$$p_a = 0.03, \quad p_t = 0.08$$

we obtain the required sample size n from Eq. 9.14 as

$$n = \left[\frac{\Phi^{-1}(0.95) - \Phi^{-1}(0.05)}{\Phi^{-1}(0.08) - \Phi^{-1}(0.03)}\right]^2 = \left[\frac{1.64 - (-1.64)}{-1.405 - (-1.88)}\right]^2 \simeq 47$$

The corresponding tolerable fraction defective M is obtained from Eq. 9.15 as

$$M = \Phi\left[\Phi^{-1}(0.03) - \frac{\Phi^{-1}(0.05)}{\sqrt{47}}\right] = \Phi\left[(-1.88) - \frac{(-1.64)}{\sqrt{47}}\right] = \Phi(-1.64) = 0.05$$




Suppose that the sample mean of D from 47 specimens is computed to be 1.005;
would this section of embankment be acceptable? Assume that the standard deviation
of D, namely $\sigma$, is 0.01.


With the mean value of D estimated to be 1.005, the probability that a fill specimen
will be of poor quality is given by

$$P(D < 0.99) = \Phi\left(\frac{0.99 - 1.005}{0.01}\right) = \Phi(-1.5) = 0.067$$

Hence the estimated fraction defective is 0.067 based on the observed data. Since
this exceeds the tolerable fraction M = 0.05, this section of the embankment is not
acceptable.
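The sample size and tolerable fraction defective of Example 9.9 follow directly from Eqs. 9.14 and 9.15. The sketch below is illustrative; the function name is an assumption, and because it uses unrounded normal quantiles rather than two-decimal table values, the computed n can differ slightly from a hand calculation.

```python
from math import sqrt
from statistics import NormalDist


def fraction_defective_plan(p_a, p_t, alpha, beta):
    """Eqs. 9.14 and 9.15: sample size n (unrounded) and tolerable estimated
    fraction defective M, for a Gaussian variable with known sigma."""
    std_normal = NormalDist()
    inv = std_normal.inv_cdf
    n = ((inv(1 - beta) - inv(alpha)) / (inv(p_t) - inv(p_a))) ** 2
    M = std_normal.cdf(inv(p_a) - inv(alpha) / sqrt(n))
    return n, M


# Data of Example 9.9: p_a = 0.03, p_t = 0.08, both risks 5%.
n_real, M = fraction_defective_plan(p_a=0.03, p_t=0.08, alpha=0.05, beta=0.05)
print(f"n = {n_real:.1f}, M = {M:.3f}")
```

Note that neither n nor M depends on the standard deviation or on the required minimum value of the variable; these enter only when the fraction defective is estimated from the sample mean through Eq. 9.11.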


EXAMPLE 9.10


In Example 9.3 it is specified that the probability p of the daily DO concentration
in the stream falling below 4 mg/l is 0.2 with a 95% confidence. In other words,
if p = 0.2, the probability that the stream quality will be unacceptable is 95%,
or the risk of accepting poor-quality stream is 5%. Hence $\beta$ = 0.05 and $p_t$ = 0.2.
Since only consumer's risk is involved in this case, Eq. 9.13 may be used (assuming
the daily DO is Gaussian) to determine the required period of measurement (in
days). Suppose that the daily DO is allowed to fall below 4 mg/l at most once
during the measurement period; then the stream quality will not meet the standard
if $\hat{p}$, estimated from n observed measurements according to Eq. 9.11, exceeds 1/n.
Thus M = 1/n in this case. Equation 9.13 then yields

$$\sqrt{n} = \frac{\Phi^{-1}(0.95)}{\Phi^{-1}(0.2) - \Phi^{-1}(1/n)} = \frac{1.64}{-0.84 - \Phi^{-1}(1/n)}$$

By trial and error, the solution to the foregoing equation gives n = 11. Therefore
a period of 11 days of measurement is required; if the measured daily DO does not
fall below 4 mg/l more than once during this period, the stream quality satisfies
the DO standard with 95% confidence.

This result may be compared with the period of 22 days based on sampling by
attributes in Example 9.3. Assuming that DO has a normal distribution, the
sampling by variables therefore reduces the required period of observations to half
that of sampling by attributes.


9.3. MULTIPLE-STAGE SAMPLING 


In addition to the single-stage sampling plan discussed above, multiple-stage
sampling is sometimes used in inspection programs. An example of this
has been discussed in Example 9.3. The objective is to accept the
unquestionably high-quality products and to reject the unquestionably
low-quality products based on relatively small samples taken at the first stage.
The lots with questionable quality are subjected to a second-stage sampling.
Based on the combined samples from the two stages, these lots will be
accepted or rejected, or be subjected to a third-stage sampling. This
process may be continued to several sampling stages. The risks (producer's
and consumer's) involved in the multiple-stage sampling can be made
equivalent to that in the single-stage sampling, provided the OC curves are
the same in each sampling plan through appropriate choice of sample sizes
and acceptance-rejection criteria. The reader is referred to Lipson and
Sheth (1973) and Hald (1952) for a further treatment of this subject.


9.4. CONCLUDING REMARKS 


In this concluding chapter, we have introduced the elementary concepts
pertinent to the development of programs for quality assurance and acceptance
sampling. Acceptance sampling programs or plans are of two
general types: sampling by attributes and sampling by variables. In all cases,
an important consideration in the development of a program is the resolution
of the conflicting interests of the producer and consumer.

The application of the concepts introduced here is facilitated greatly by
the availability of pertinent tables and charts, such as the tables of binomial
probabilities (Eisenhart, 1950; Aiken, 1955). Moreover, tables corresponding
to standard sampling plans have been developed for both acceptance
sampling by attributes and sampling by variables. For example,
the military standard MIL-STD-414 (1957) and MIL-STD-105D (1963)
provide charts giving the required sample size and the acceptance-rejection
criteria for both single and multiple sampling plans once the lot size, the
tightness of the inspection level (as defined by the level of producer's and
consumer's risks), and the tolerable fraction of defectives have been
specified.


PROBLEMS 


9.1 (a) A large set of welds is submitted for inspection. One out of 20 welds
inspected was found to contain flaws. Suppose that the workmanship would
be unacceptable if more than 10% of the welds contain flaws. Should the
welds be accepted if the risk of accepting welds with unsatisfactory
workmanship is limited to 5%?

(b) Suppose further that good workmanship is defined as no more than 3%
of the welds contain flaws. If the risk of rejecting welds with good workmanship
is limited to 10%, devise a sampling plan that will satisfy the
producer's and consumer's risk requirements. Use any pertinent data of
part (a) as necessary.

(c) Consider a sampling plan that requires inspection of 25 welds and at
least 24 welds must be flawless for acceptance. Those welds that are
rejected are required to be repaired. Determine the AOQ curve and AOQL
corresponding to this sampling plan.


9.2 Sulphur dioxide is a main source of air pollution in a major city. Suppose
    that a concentration of SO₂ less than 0.1 unit (for example, parts per
    million parts of air) is harmless to human beings. If it is desired to
    maintain this condition for at least 90% of the time, what minimum number
    of daily measurements of SO₂ concentration is required to assure the
    desired air quality with 95% confidence? Assume that SO₂ concentrations
    between days are statistically independent. Ans. 29.


9.3 (a) Soft lenses of sand deposits are hazardous to foundation safety during
        earthquakes. Soil boring is one way of detecting the presence of such
        lenses. Records of ten borings made at random locations over a large
        building site show no signs of soft lenses of sand deposit. What is the
        confidence (probability) that sand lenses would not be found in more
        than 15% of the area beneath the site? Ans. 0.803.

    (b) Suppose that the engineer would like to have a 99% confidence that sand
        lenses would occupy less than 15% of the site area. How many additional
        borings should be made? Ans. 19.


9.4 Individual diesel engines used for generating emergency electrical power
    must have a high reliability of starting during an emergency. If a
    reliability (to start) of 99% is required, how many consecutive successful
    starts would be necessary to ensure this reliability with a 95% confidence?


9.5 Ready-mix concrete is supplied to a building site for construction. In
    order to ensure that the concrete meets the specified strength requirement,
    specimens of the ready-mix concrete are subjected to compression tests.
    Suppose that a batch containing 5% or less understrength concrete is
    regarded as good quality, whereas one containing 20% or more understrength
    concrete is deemed unsatisfactory. Devise a sampling plan for assuring
    quality concrete, if the producer's and consumer's risks are 10% and 6%,
    respectively. Ans. n = 30; r = 3.


9.6 Structural-grade lumber is used for falsework in a construction project.
    Lumber of a given dimension is shipped in truck loads. Suppose that five
    pieces per truck load are selected at random and subjected to examination
    and test for quality assurance. Acceptance of a load requires all five
    pieces to be nondefective.

    (a) If the allowable fraction defective can be as high as 30%, what is the
        consumer's risk?

    (b) The supplier here is not expected to deliver structural-grade wood with
        less than 10% defective. If the fraction defective in a truck load of
        structural-grade lumber is actually 10%, what is the probability that a
        truck load will be rejected?


9.7 In an earthdam construction project, the density of the compacted fill is
    measured at random locations to ensure the specified degree of compaction.
    Suppose that the acceptance criterion requires that the average dry density
    from 10 locations should be at least 118 lb/ft³. Assume that 126 and
    114 lb/ft³ correspond, respectively, to the mean densities of good- and
    poor-quality compaction. Assume also that the density varies from location
    to location and may be assumed to be Gaussian, with a standard deviation σ
    of 4 lb/ft³.

    (a) What are the consumer's and producer's risks associated with this
        sampling plan?

    (b) If the consumer's and producer's risks are specified at 5% and 10%,
        respectively, devise the appropriate sampling plan.

    (c) Repeat part (b), assuming that σ is unknown, but expected to be around
        4 lb/ft³.


Table A.1. Table of Standard Normal Probability
$\Phi(z) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{z} \exp(-\tfrac{1}{2}x^2)\,dx$

[Table body: columns of z and Φ(z), followed in the continuation by columns of
z and 1 − Φ(z); the individual entries are not recoverable.]
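
Values of the standard normal probability of the kind tabulated in Table A.1
can be regenerated from the error function. The sketch below is a minimal
Python illustration; the function names are assumptions introduced here, and
the identity Φ(z) = ½[1 + erf(z/√2)] is the standard relation between the
normal distribution function and the error function.

```python
from math import erf, erfc, sqrt


def std_normal_cdf(z: float) -> float:
    """Phi(z): probability that a standard normal variate is at most z."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def std_normal_upper_tail(z: float) -> float:
    """1 - Phi(z), computed with erfc to keep precision for large z."""
    return 0.5 * erfc(z / sqrt(2.0))


if __name__ == "__main__":
    # Print a short excerpt in the two-column (z, Phi(z)) layout of the table.
    for i in range(0, 50, 10):
        z = i / 100.0
        print(f"{z:4.2f}  {std_normal_cdf(z):.6f}")
```

The separate upper-tail function matters for large z, where 1 − Φ(z) is of the
order of 10⁻⁵ or smaller and direct subtraction from 1 would lose digits.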


Table A.2. p-Percentile Values of the t-Distribution*

                                   p
   f     0.750   0.900   0.950   0.975   0.990   0.995   0.999

   1     1.000   3.078   6.314  12.706  31.821  63.657   318
   2     0.816   1.886   2.920   4.303   6.965   9.925   22.3
   3     0.765   1.638   2.353   3.182   4.541   5.841   10.2
   4     0.741   1.533   2.132   2.776   3.747   4.604   7.173
   5     0.727   1.476   2.015   2.571   3.365   4.032   5.893

   6     0.718   1.440   1.943   2.447   3.143   3.707   5.208
   7     0.711   1.415   1.895   2.365   2.998   3.499   4.785
   8     0.706   1.397   1.860   2.306   2.896   3.355   4.501
   9     0.703   1.383   1.833   2.262   2.821   3.250   4.297
  10     0.700   1.372   1.812   2.228   2.764   3.169   4.144

  11     0.697   1.363   1.796   2.201   2.718   3.106   4.025
  12     0.695   1.356   1.782   2.179   2.681   3.055   3.930
  13     0.694   1.350   1.771   2.160   2.650   3.012   3.852
  14     0.692   1.345   1.761   2.145   2.624   2.977   3.787
  15     0.691   1.341   1.753   2.131   2.602   2.947   3.733

  16     0.690   1.337   1.746   2.120   2.583   2.921   3.686
  17     0.689   1.333   1.740   2.110   2.567   2.898   3.646
  18     0.688   1.330   1.734   2.101   2.552   2.878   3.610
  19     0.688   1.328   1.729   2.093   2.539   2.861   3.579
  20     0.687   1.325   1.725   2.086   2.528   2.845   3.552

  21     0.686   1.323   1.721   2.080   2.518   2.831   3.527
  22     0.686   1.321   1.717   2.074   2.508   2.819   3.505
  23     0.685   1.319   1.714   2.069   2.500   2.807   3.485
  24     0.685   1.318   1.711   2.064   2.492   2.797   3.467
  25     0.684   1.316   1.708   2.060   2.485   2.787   3.450

  26     0.684   1.315   1.706   2.056   2.479   2.779   3.435
  27     0.684   1.314   1.703   2.052   2.473   2.771   3.421
  28     0.683   1.313   1.701   2.048   2.467   2.763   3.408
  29     0.683   1.311   1.699   2.045   2.462   2.756   3.396
  30     0.683   1.310   1.697   2.042   2.457   2.750   3.385

  40     0.681   1.303   1.684   2.021   2.423   2.704   3.307
  60     0.679   1.296   1.671   2.000   2.390   2.660   3.232
 120     0.677   1.289   1.658   1.980   2.358   2.617   3.160
  ∞      0.674   1.282   1.645   1.960   2.326   2.576   3.090

* Abridged from Table 12 of Biometrika Tables for Statisticians, vol. I, edited by
E. S. Pearson and H. O. Hartley, Cambridge University Press, Cambridge (1954), and
Table III of Statistical Tables for Biological, Agricultural, and Medical Research,
R. A. Fisher and F. Yates, Oliver & Boyd, Edinburgh, 1953.




Table A.3. α-Percentile Values of the χ² Distribution* (After Brownlee, 1960)

                                           α
   f     0.005      0.025     0.050     0.900   0.950   0.975   0.990   0.995   0.999

   1     0.0000393  0.000982  0.00393    2.71    3.84    5.02    6.63    7.88    10.8
   2     0.0100     0.0506    0.103      4.61    5.99    7.38    9.21    10.6    13.8
   3     0.0717     0.216     0.352      6.25    7.81    9.35    11.3    12.8    16.3
   4     0.207      0.484     0.711      7.78    9.49    11.1    13.3    14.9    18.5
   5     0.412      0.831     1.15       9.24    11.1    12.8    15.1    16.7    20.5

   6     0.676      1.24      1.64       10.6    12.6    14.4    16.8    18.5    22.5
   7     0.989      1.69      2.17       12.0    14.1    16.0    18.5    20.3    24.3
   8     1.34       2.18      2.73       13.4    15.5    17.5    20.1    22.0    26.1
   9     1.73       2.70      3.33       14.7    16.9    19.0    21.7    23.6    27.9
  10     2.16       3.25      3.94       16.0    18.3    20.5    23.2    25.2    29.6

  11     2.60       3.82      4.57       17.3    19.7    21.9    24.7    26.8    31.3
  12     3.07       4.40      5.23       18.5    21.0    23.3    26.2    28.3    32.9
  13     3.57       5.01      5.89       19.8    22.4    24.7    27.7    29.8    34.5
  14     4.07       5.63      6.57       21.1    23.7    26.1    29.1    31.3    36.1
  15     4.60       6.26      7.26       22.3    25.0    27.5    30.6    32.8    37.7

  16     5.14       6.91      7.96       23.5    26.3    28.8    32.0    34.3    39.3
  17     5.70       7.56      8.67       24.8    27.6    30.2    33.4    35.7    40.8
  18     6.26       8.23      9.39       26.0    28.9    31.5    34.8    37.2    42.3
  19     6.84       8.91      10.1       27.2    30.1    32.9    36.2    38.6    43.8
  20     7.43       9.59      10.9       28.4    31.4    34.2    37.6    40.0    45.3

  21     8.03       10.3      11.6       29.6    32.7    35.5    38.9    41.4    46.8
  22     8.64       11.0      12.3       30.8    33.9    36.8    40.3    42.8    48.3
  23     9.26       11.7      13.1       32.0    35.2    38.1    41.6    44.2    49.7
  24     9.89       12.4      13.8       33.2    36.4    39.4    43.0    45.6    51.2
  25     10.5       13.1      14.6       34.4    37.7    40.6    44.3    46.9    52.6

  26     11.2       13.8      15.4       35.6    38.9    41.9    45.6    48.3    54.1
  27     11.8       14.6      16.2       36.7    40.1    43.2    47.0    49.6    55.5
  28     12.5       15.3      16.9       37.9    41.3    44.5    48.3    51.0    56.9
  29     13.1       16.0      17.7       39.1    42.6    45.7    49.6    52.3    58.3
  30     13.8       16.8      18.5       40.3    43.8    47.0    50.9    53.7    59.7

  35     17.2       20.6      22.5       46.1    49.8    53.2    57.3    60.3    66.6
  40     20.7       24.4      26.5       51.8    55.8    59.3    63.7    66.8    73.4
  45     24.3       28.4      30.6       57.5    61.7    65.4    70.0    73.2    80.1
  50     28.0       32.4      34.8       63.2    67.5    71.4    76.2    79.5    86.7
  75     47.2       52.9      56.1       91.1    96.2    100.8   106.4   110.3   118.6
 100     67.3       74.2      77.9       118.5   124.3   129.6   135.8   140.2   149.4

* Abridged from Table V of Statistical Tables and Formulas by A. Hald, John Wiley &
Sons, New York, New York (1952).


Table A.4. Critical Values of $D_n^{\alpha}$ in the Kolmogorov-Smirnov Test
(After Hoel, 1962)

                          α
   n      0.20       0.10       0.05       0.01

   5      0.45       0.51       0.56       0.67
  10      0.32       0.37       0.41       0.49
  15      0.27       0.30       0.34       0.40
  20      0.23       0.26       0.29       0.36
  25      0.21       0.24       0.27       0.32
  30      0.19       0.22       0.24       0.29
  35      0.18       0.20       0.23       0.27
  40      0.17       0.19       0.21       0.25
  45      0.16       0.18       0.20       0.24
  50      0.15       0.17       0.19       0.23
 >50      1.07/√n    1.22/√n    1.36/√n    1.63/√n
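
For sample sizes beyond the last tabulated row, the final line of Table A.4
gives the critical value as a coefficient divided by √n. A minimal Python
sketch of that rule follows; the dictionary and function names are
illustrative assumptions, and only the four significance levels listed in the
table are supported.

```python
from math import sqrt

# Coefficients from the ">50" row of Table A.4, keyed by significance level.
KS_LARGE_N_COEFFICIENT = {0.20: 1.07, 0.10: 1.22, 0.05: 1.36, 0.01: 1.63}


def ks_critical_large_n(n: int, alpha: float) -> float:
    """Critical Kolmogorov-Smirnov statistic for n > 50 at significance alpha."""
    if n <= 50:
        raise ValueError("for n <= 50 use the tabulated values instead")
    if alpha not in KS_LARGE_N_COEFFICIENT:
        raise ValueError("alpha must be one of 0.20, 0.10, 0.05, 0.01")
    return KS_LARGE_N_COEFFICIENT[alpha] / sqrt(n)


if __name__ == "__main__":
    print(ks_critical_large_n(100, 0.05))
```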


Appendix B 


Combinatorial Formulas 


In probability problems involving discrete and finite sample spaces, the 
definition of events and the underlying sample space entails the enumera- 
tion of sets or subsets of sample points. For this purpose, the techniques of 
combinatorial analysis are often useful. We summarize here the basic ele- 
ments of combinatorial analysis. 


B.1 THE BASIC RELATION 


If there are k positions in a sequence, and $n_1$ distinguishable elements
can occupy position 1, $n_2$ can occupy position 2, ..., and $n_k$ can occupy
position k, the number of distinct sequences of k elements each is given by

$$N(k \mid n_1, n_2, \ldots, n_k) = n_1 n_2 \cdots n_k \tag{B.1}$$


Examples


(a) If a design involves three parameters $\delta_1, \delta_2, \delta_3$ and
there are, respectively, 2, 3, 4 values of these parameters, the number of
feasible designs is

$$N(3 \mid 2, 3, 4) = (2)(3)(4) = 24$$

(b) In a three-dimensional Cartesian coordinate system x, y, z, if 10
discrete values are specified for each of the axes, for example, x = 0, 1,
2, ..., 9, then the total number of coordinate positions is

$$N(3 \mid 10, 10, 10) = (10)^3 = 1000$$
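
The basic relation can be checked by brute-force enumeration. The following
Python sketch, with illustrative function names that are not part of the text,
compares the product formula of Eq. B.1 with an explicit listing of the
sequences for the two examples above.

```python
from itertools import product
from math import prod


def count_sequences(choices_per_position):
    """Eq. B.1: product of the number of choices available at each position."""
    return prod(choices_per_position)


def enumerate_sequences(choices_per_position):
    """Explicitly list every sequence; element i ranges over 0..n_i - 1."""
    return list(product(*(range(n) for n in choices_per_position)))


if __name__ == "__main__":
    # Example (a): three design parameters with 2, 3, and 4 possible values.
    print(count_sequences([2, 3, 4]), len(enumerate_sequences([2, 3, 4])))
    # Example (b): a 10 x 10 x 10 grid of coordinate positions.
    print(count_sequences([10, 10, 10]))
```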


B.2 ORDERED SEQUENCES 


In a set of n distinct elements, the number of k-element ordered sequences,
or arrangements, is

$$(n)_k = n(n-1)(n-2)\cdots(n-k+1) = \frac{n!}{(n-k)!} \tag{B.2}$$

For the first position in the sequence of k elements, there are n elements
available to occupy it; but there are only (n − 1) elements available for
the second position since one of the n has been used for the first position,
and only (n − 2) elements available for the third position, and so on.
Hence, by virtue of the basic relation, Eq. B.1,

$$(n)_k = N(k \mid n, n-1, \ldots, n-k+1)$$

thus obtaining Eq. B.2.


Examples 


(a) If no digits are repeated, the number of four-digit figures is
$(10)_4 = (10)(9)(8)(7) = 5040$, whereas, if the digits can be repeated, the
number of four-digit figures would be $(10)^4 = 10{,}000$; this latter case
would include 0000 as one of the figures.

(b) In taking samples sequentially from a discrete sample space
(population), the sampling may be done either with replacements or without
replacements; that is, when an element is drawn from the population it is
either returned or not returned to the population before another element is
drawn. In a population of n elements, the number of ordered samples of
size r then is $n^r$ for sampling with replacements, whereas in sampling
without replacements, the corresponding number of ordered samples is $(n)_r$.
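
As a check on Eq. B.2, the count of ordered sequences can be compared with a
direct enumeration. The Python sketch below uses `math.perm` (available from
Python 3.8) for $(n)_k$; the helper names are illustrative only.

```python
from itertools import permutations, product
from math import perm


def ordered_samples_without_replacement(n: int, r: int) -> int:
    """(n)_r = n (n-1) ... (n-r+1), Eq. B.2."""
    return perm(n, r)


def ordered_samples_with_replacement(n: int, r: int) -> int:
    """n**r ordered samples when each drawn element is returned."""
    return n**r


if __name__ == "__main__":
    digits = "0123456789"
    # Four-digit figures without repeated digits, counted two ways.
    print(ordered_samples_without_replacement(10, 4),
          sum(1 for _ in permutations(digits, 4)))
    # Four-digit figures with repetition allowed (0000 included).
    print(ordered_samples_with_replacement(10, 4),
          sum(1 for _ in product(digits, repeat=4)))
```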


B.3 THE BINOMIAL COEFFICIENT


In a set of n distinguishable elements, the number of possible subsets of
k different elements each (regardless of order) is given by the binomial
coefficient

$$\binom{n}{k} = \frac{(n)_k}{k!} \tag{B.3}$$

It may be emphasized that in Eq. B.2 the ordering of the k elements is of
significance (that is, different orderings of the same elements constitute
different sequences, or arrangements), whereas in Eq. B.3 the order is
irrelevant. In a set of k elements, the position of the elements can be
permuted k! times; hence, by virtue of Eq. B.2, we obtain Eq. B.3 as the
number of different k-element subsets (disregarding order).

Equation B.3 is defined only for k ≤ n. Using Eq. B.2 for $(n)_k$, we have

$$\binom{n}{k} = \frac{n!}{k!\,(n-k)!} \tag{B.3a}$$

from which it is clear that

$$\binom{n}{k} = \binom{n}{n-k}$$

Equation B.3 or B.3a is known as the binomial coefficient because this is
precisely the coefficient in the binomial expansion of $(x + y)^n$; namely,

$$(x+y)^n = \binom{n}{0} x^n + \binom{n}{1} x^{n-1} y + \cdots + \binom{n}{k} x^{n-k} y^k + \cdots + \binom{n}{n} y^n$$


Examples 


(a) From a population of n elements, the number of different samples
of size r is $\binom{n}{r}$. This is the same as sampling without replacement,
except that the order is disregarded; therefore the number is

$$\binom{n}{r} = \frac{(n)_r}{r!}$$

(b) Among 25 concrete cylinders marked 1, 2, ..., 25, the number of
possible samples of five cylinders each is

$$\binom{25}{5} = \frac{25!}{5!\,20!} = 53{,}130$$
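
The relations of this section are easy to confirm numerically. The Python
sketch below, with illustrative function names, forms the binomial coefficient
from Eq. B.3, compares it with `math.comb`, and checks the binomial expansion
for integer arguments.

```python
from math import comb, factorial, perm


def subsets_count(n: int, k: int) -> int:
    """Eq. B.3: ordered arrangements (n)_k divided by the k! orderings."""
    return perm(n, k) // factorial(k)


def binomial_expansion(x: int, y: int, n: int) -> int:
    """Right-hand side of the binomial expansion of (x + y)**n."""
    return sum(comb(n, k) * x ** (n - k) * y**k for k in range(n + 1))


if __name__ == "__main__":
    # Samples of five cylinders out of 25, by Eq. B.3 and by Eq. B.3a.
    print(subsets_count(25, 5), comb(25, 5))
    # Symmetry of the coefficient: choosing 5 is the same as leaving out 20.
    print(comb(25, 5) == comb(25, 20))
    # Expansion check with x = 2, y = 3, n = 7.
    print(binomial_expansion(2, 3, 7) == (2 + 3) ** 7)
```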


B.4 THE MULTINOMIAL COEFFICIENT


If n distinguishable elements are divided into r distinct groups of
$k_1, k_2, \ldots, k_r$ elements, respectively, so that
$k_1 + k_2 + \cdots + k_r = n$, the number of ways to form such r groups is
given by the multinomial coefficient

$$\binom{n}{k_1, k_2, \ldots, k_r} = \frac{n!}{k_1!\,k_2! \cdots k_r!}$$

Among the n elements, the first group of $k_1$ elements can be chosen in
$\binom{n}{k_1}$ ways. The second group of $k_2$ elements can be chosen from
the remaining $(n - k_1)$ elements in $\binom{n-k_1}{k_2}$ ways, and so on. The
total number of ways of dividing the n elements into r groups of
$k_1, k_2, \ldots, k_r$, therefore, is

$$\binom{n}{k_1}\binom{n-k_1}{k_2} \cdots \binom{n-k_1-\cdots-k_{r-1}}{k_r} = \frac{n!}{k_1!\,k_2! \cdots k_r!}$$


Example


In a given region, six earthquakes of intensities V, VI, VII may occur in
the next 10 years. Three earthquakes of intensity V, two of VI, and one of
VII can occur in

$$\frac{6!}{3!\,2!\,1!} = 60 \text{ different sequences}$$
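
The earthquake example can be verified by listing the distinct orderings
directly. The following Python sketch uses illustrative names; it evaluates the
multinomial coefficient both from the factorial formula and from the product of
binomial coefficients used in the argument above, and then counts the distinct
sequences by enumeration.

```python
from itertools import permutations
from math import comb, factorial


def multinomial(*group_sizes: int) -> int:
    """n! / (k_1! k_2! ... k_r!) with n equal to the sum of the group sizes."""
    result = factorial(sum(group_sizes))
    for k in group_sizes:
        result //= factorial(k)
    return result


def multinomial_by_stages(*group_sizes: int) -> int:
    """Choose each group in turn from the elements that remain."""
    remaining, ways = sum(group_sizes), 1
    for k in group_sizes:
        ways *= comb(remaining, k)
        remaining -= k
    return ways


if __name__ == "__main__":
    events = ["V", "V", "V", "VI", "VI", "VII"]
    distinct_orderings = set(permutations(events))
    print(multinomial(3, 2, 1), multinomial_by_stages(3, 2, 1),
          len(distinct_orderings))
```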


B.5 STIRLING'S FORMULA


An important formula for computing the factorial of large numbers
approximately is Stirling's formula:

$$n! \approx \sqrt{2\pi}\; n^{\,n+1/2}\, e^{-n} \tag{B.6}$$

A proof of the formula can be found in Feller (1957). The approximation
is good even for n as small as 10 (error less than 1%).
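
The accuracy claim can be examined numerically. The Python sketch below, with
illustrative function names, evaluates Eq. B.6 in floating point and reports
its relative error against the exact factorial; note that the floating-point
form overflows for n much beyond about 140, where logarithms would be needed.

```python
from math import exp, factorial, pi, sqrt


def stirling(n: int) -> float:
    """Stirling's approximation, Eq. B.6: sqrt(2*pi) * n**(n + 1/2) * exp(-n)."""
    return sqrt(2.0 * pi) * n ** (n + 0.5) * exp(-n)


def relative_error(n: int) -> float:
    """(n! - approximation) / n!; positive when the formula underestimates."""
    exact = factorial(n)
    return (exact - stirling(n)) / exact


if __name__ == "__main__":
    for n in (5, 10, 20, 50):
        print(n, relative_error(n))
```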


Appendix C


Derivation of the Poisson Distribution


The Poisson distribution describes the probability mass function for the
number of occurrences of an event within a specified interval of time or
space. It is the result of an underlying counting process X(t), known as a
Poisson process, which is a model of the random occurrences of an event
in time (or space) t.

The Poisson process model is based on the following assumptions.

1. At any instant of time (or point in space), there can be at most one
   occurrence of an event; in other words, the probability of n occurrences
   of the event over a small interval Δt is of order $o(\Delta t)^n$.

2. The occurrences of an event in nonoverlapping time (or space)
   intervals are statistically independent; this is the assumption of
   independent increments.

3. The probability of an occurrence in (t, t + Δt) is proportional to
   Δt; that is,

   $$P[X(\Delta t) = 1] = \nu \Delta t$$

   where ν is a positive proportionality constant.

On the basis of assumption 2, we obtain, with the theorem of total
probability,

$$P[X(t + \Delta t) = x] = P[X(t) = x]\,P[X(\Delta t) = 0] + P[X(t) = x-1]\,P[X(\Delta t) = 1] + P[X(t) = x-2]\,P[X(\Delta t) = 2] + \cdots$$

Then on the basis of assumptions 1 and 3, we obtain, using the notation
$p_x(t) = P[X(t) = x]$,

$$p_x(t + \Delta t) = [1 - \nu \Delta t - o(\Delta t)^2 - \cdots]\,p_x(t) + \nu \Delta t\, p_{x-1}(t) + o(\Delta t)^2\, p_{x-2}(t) + \cdots$$

Neglecting the higher-order terms, the preceding equation becomes

$$\frac{p_x(t + \Delta t) - p_x(t)}{\Delta t} = -\nu p_x(t) + \nu p_{x-1}(t)$$

Therefore, in the limit as Δt → 0, we obtain the following differential
equation for $p_x(t)$:

$$\frac{dp_x(t)}{dt} = -\nu p_x(t) + \nu p_{x-1}(t) \tag{C.1}$$

It should be recognized that Eq. C.1 applies for any x ≥ 1. For x = 0, the
preceding derivation leads to the following:

$$\frac{dp_0(t)}{dt} = -\nu p_0(t) \tag{C.2}$$

If the counting process starts from zero, the initial conditions associated
with Eqs. C.1 and C.2 are

$$p_0(0) = 1.0 \quad \text{and} \quad p_x(0) = 0$$

The solution of Eq. C.2 with the first of the initial conditions yields,
for x = 0,

$$p_0(t) = e^{-\nu t}$$

For x ≥ 1, the solutions to Eq. C.1 are

$$p_1(t) = \nu t\, e^{-\nu t}$$

$$p_2(t) = \frac{(\nu t)^2}{2!}\, e^{-\nu t}$$

and for a general x,

$$p_x(t) = \frac{(\nu t)^x}{x!}\, e^{-\nu t} \tag{C.3}$$

Equation C.3 is the Poisson probability mass function, in which the
parameter ν is the mean occurrence rate of the event.
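
The derivation can be illustrated numerically by stepping Eqs. C.1 and C.2
forward in small increments Δt, exactly as in the difference equation that
precedes them, and comparing the result with Eq. C.3. The Python sketch below
is a minimal illustration under that assumption; the function names, the rate
ν = 2, and the interval t = 1.5 are arbitrary choices made here, not values
from the text.

```python
from math import exp, factorial


def poisson_pmf(x: int, nu: float, t: float) -> float:
    """Eq. C.3: probability of exactly x occurrences in an interval t."""
    return (nu * t) ** x / factorial(x) * exp(-nu * t)


def integrate_poisson_odes(nu: float, t_end: float, x_max: int,
                           steps: int = 20000) -> list:
    """Forward-difference solution of Eqs. C.1 and C.2 for x = 0..x_max,
    starting from p_0(0) = 1 and p_x(0) = 0 for x >= 1."""
    dt = t_end / steps
    p = [1.0] + [0.0] * x_max
    for _ in range(steps):
        updated = [p[0] - nu * p[0] * dt]                       # Eq. C.2
        for x in range(1, x_max + 1):
            updated.append(p[x] + (-nu * p[x] + nu * p[x - 1]) * dt)  # Eq. C.1
        p = updated
    return p


if __name__ == "__main__":
    nu, t = 2.0, 1.5
    numerical = integrate_poisson_odes(nu, t, x_max=5)
    for x, value in enumerate(numerical):
        print(x, value, poisson_pmf(x, nu, t))
```

With a finite step the numerical values differ slightly from Eq. C.3; the
difference shrinks as the number of steps is increased, which mirrors the
limit Δt → 0 taken in the derivation.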


References


1.  Aiken, H. H., Tables of the Cumulative Binomial Probability Distribution,
    Harvard Univ. Press, Cambridge, Mass., 1955.

2.  Allen, D. E., "Statistical Study of the Mechanical Properties of
    Reinforcing Bars," Building Research Note No. 85, National Research
    Council, Ottawa, Canada, April 1972.

3.  Ang, A. H-S., "Structural Risk Analysis and Reliability-Based
    Design," Jour. of the Structural Division, ASCE, Vol. 99, No. ST9,
    September 1973, pp. 1891-1910.

4.  Bachmann, W. K., "Estimation Stochastique de la Précision des
    Mesures," Mensuration Photogrammétrie Génie Rural, Vol. 71,
    December 1973, pp. 107-118.

5.  Bagnold, R. A., "Interim Report on Wave Pressure Research," Jour.
    Institution of Civil Engineers, London, England, Vol. 12, June 1939,
    pp. 202-206.

6.  Barry, B. A., Engineering Measurements, J. Wiley and Sons, New
    York, 1964.

7.  Barsom, J. M., "Relationships Between Plane-Strain Ductility and
    K_Ic for Various Steels," ASME 1st National Congress on Pressure
    Vessels and Piping, Paper No. 71-PVP-13, May 1971.

8.  Benjamin, J. R., "Probabilistic Models for Seismic Force Design,"
    Jour. of the Structural Division, ASCE, Vol. 94, No. ST5, May 1968,
    pp. 1175-1196.

9.  Brownlee, K. A., Statistical Theory and Methodology in Science and
    Engineering, J. Wiley and Sons, New York, 1960.

10. Bureau of Public Works, "Traffic Assignment and Distribution for
    Small Urban Areas," U.S. Dept. of Commerce, September 1965.

11. Butts, T. A., Schnepper, D. H., and Evans, R. L., "Statistical
    Assessment of DO in Navigation Pool," Jour. of the Sanitary Engineering
    Division, ASCE, Vol. 96, No. SA2, April 1970, p. 48.

12. Cartwright, D. E., and Longuet-Higgins, M. S., "The Statistical
    Distribution of the Maxima of a Random Function," Proc. Royal
    Society, Series A, Vol. 237, 1956.

13. Clopper, C. J., and Pearson, E. S., "The Use of Confidence or Fiducial
    Limits Illustrated in the Case of the Binomial," Biometrika, Vol. 26,
    1934, pp. 404-413.

14. Cornell, C. A., "A Normative Second-Moment Reliability Theory for
    Structural Design," Solid Mechanics Division, Univ. of Waterloo,
    Waterloo, Ontario, Canada, 1969.

15. Cox, E. A., "Information Needs for Controlling Equipment Costs,"
    Highway Research Record, No. 278, Highway Research Board, National
    Research Council, 1969, pp. 35-48.

16. Cusens, A. R., and Wettern, J. H., "Quality Control in Factory-Made
    Precast Concrete," Civil Engineering and Public Works Review, Vol.
    64, 1959.

17. Donovan, N. C., "A Stochastic Approach to the Seismic Liquefaction
    Problem," Proc. 1st Int. Conf. on Application of Statistics and
    Probability to Soil and Structural Engineering, Hong Kong Univ. Press,
    1972, pp. 513-535.

18. Elderton, W. P., Frequency Curves and Correlation, 4th Ed.,
    Cambridge Univ. Press, Cambridge, England, 1953.

19. Eisenhart, C., Tables of the Binomial Probability Distribution, Applied
    Mathematics Series 6, National Bureau of Standards, Washington,
    D.C., 1950.

20. Feller, W., An Introduction to Probability Theory and Its
    Applications, Vol. I, 2nd Ed., J. Wiley and Sons, New York, 1957.

21. Fisher, J. W., Frank, K. H., Hirt, M. A., and McNamee, B. M., "Effects
    of Weldments on the Fatigue Strength of Steel Beams," NCHRP
    Rept. No. 102, Highway Research Board, National Research Council,
    1970.

22. Forbes, W. S., "A Survey of Progress in House Building," Building
    Technology and Management, Vol. 7 (4), April 1969, pp. 88-91.

23. Freund, J. E., Mathematical Statistics, Prentice-Hall, Englewood
    Cliffs, N. J., 1962.

24. Galligan, W. L., and Snodgrass, D. V., "Machine Stress Rated
    Lumber: Challenge to Design," Jour. of the Structural Division,
    ASCE, Vol. 96, No. ST12, December 1970.

25. Gerlough, D. L., "Use of Poisson Distribution in Highway Traffic,"
    Poisson and Traffic, The Eno Foundation for Highway Traffic Control,
    Saugatuck, Conn., 1955, p. 38.

26. Goldman, J. L., and Ushijima, T., "Decrease in Hurricane Winds
    After Landfall," Jour. of the Structural Division, ASCE, Vol. 100,
    No. ST1, January 1974, pp. 129-141.

27. Grandage, A., "Acceptance Sampling by Variables," Proceedings,
    National Conf. on Statistical Quality Control Methodology on Highway
    and Airfield Construction, May 1966, Univ. of Virginia,
    Charlottesville, Virginia.


28. Gumbel, E. J., "Statistical Theory of Extreme Values and Some
    Practical Applications," Applied Mathematics Series 33, National
    Bureau of Standards, Washington, D.C., February 1954.

29. Hald, A., Statistical Theory with Engineering Applications, J. Wiley
    and Sons, New York, 1952.

30. Hardy, G. H., Littlewood, J. E., and Polya, G., Inequalities,
    Cambridge Univ. Press, Cambridge, England, 1959.

31. Harter, H. L., New Tables of the Incomplete Gamma Function, Ratio
    and of Percentage Points of the Chi-square and Beta Distributions,
    Aerospace Research Laboratories, U.S. Air Force (U.S. Gov't.
    Printing Office, Washington, D.C.), 1963.

32. Hazen, A., Flood Flows, a Study in Frequency and Magnitude, J.
    Wiley and Sons, New York, 1930.

33. Heathington, K. W., and Tutt, P. R., "Traffic Volume Characteristics
    on Urban Freeway," Transportation Engineering Jour., ASCE, Vol.
    97, TE1, February 1971, p. 108.

34. Hoel, P. G., Introduction to Mathematical Statistics, 3rd Ed., J. Wiley
    and Sons, New York, 1962.

35. Hoffman, D., and Lewis, E. V., "Analysis and Interpretation of
    Full-Scale Data on Midship Bending Stresses of Dry Cargo Ships,"
    Ship Structure Committee, SSC-196, U.S. Coast Guard Hdqtrs.,
    Washington, D.C., June 1969.

36. Hognestad, E., "A Study of Combined Bending and Axial Load in
    Reinforced Concrete Members," Engineering Experiment Station
    Bulletin No. 399, Univ. of Illinois, Urbana, Ill., 1951.

37. Jordan, W., Eggert, O., and Kneissl, M., Handbuch der
    Vermessungskunde, Vol. 1, J. B. Metzlersche Verlagsbuchhandlung,
    Stuttgart, 1961.

38. Julian, O. G., "Synopsis of First Progress Report of Committee on
    Factors of Safety," Jour. of the Structural Division, ASCE, Vol. 83,
    No. ST4, July 1957, p. 1316.

39. Kanafani, A. K., "Location Model for Parking Facilities,"
    Transportation Engineering Jour., ASCE, Vol. 98, No. TE1, February
    1972, pp. 117-129.

40. Kies, J. A., Smith, H. L., Romine, H. F., and Bernstein, M.,
    "Fracture Testing of Weldments," ASTM Special Publ. No. 381, 1965,
    pp. 328-356.

41. Kimball, B. F., "Assignment of Frequencies to a Completely Ordered
    Set of Sample Data," Trans. American Geophysical Union, Vol. 27,
    1946, pp. 843-846.

42. Kothandaraman, V., "A Probabilistic Analysis of Dissolved Oxygen-
    Biochemical Oxygen Demand Relationship in Streams," Ph.D.
    dissertation, Dept. of Civil Engineering, Univ. of Illinois at Urbana-
    Champaign, 1968.

43. Kothandaraman, V., and Ewing, B. B., "A Probabilistic Analysis of
    Dissolved Oxygen-Biochemical Oxygen Demand Relationship in
    Streams," Jour. Water Resources Control Federation, Part 2, February
    1969, pp. 73-90.

44. Kulak, G. L., "Statistical Aspects of Strength of Connections,"
    Proc. ASCE Specialty Conf. on Safety and Reliability of Metal
    Structures, November 1972, pp. 83-105.

45. Lam Put, R., "Dynamic Response of a Tall Building to Random Wind
    Loads," 3rd Int. Conf. on Wind Effects on Buildings and Structures,
    Tokyo, September 1971.

46. Lambe, T. W., and Whitman, R. V., Soil Mechanics, J. Wiley and
    Sons, New York, 1969, p. 375.

47. Linsley, R. K., and Franzini, J. B., Water Resources Engineering,
    McGraw-Hill Book Company, New York, 1964, p. 68.

48. Lipson, C., and Sheth, N. J., Statistical Design and Analysis of
    Engineering Experiments, McGraw-Hill Book Company, New York,
    1973.

49. Littleford, T. W., "A Comparison of Flexural Strength-Stiffness
    Relationships for Clear Wood and Structural Grades of Lumber,"
    Information Report VP-X-80, Forest Products Lab., Vancouver,
    B.C., Canada, December 1967.

50. Loucks, D. P., and Lynn, W. R., "Probabilistic Models for Predicting
    Stream Quality," Water Resources Research, Vol. 2, No. 3, September
    1966, pp. 593-605.

51. Malhotra, V. M., and Zoldners, N. G., "Some Field Experience in the
    Use of an Accelerated Method of Estimating 28-day Strength of
    Concrete," Journal of American Concrete Institute, November 1969,
    p. 895.

52. Martin, B. V., Memmott, F. W., and Bone, A. J., "Principles and
    Techniques of Predicting Future Demand for Urban Area
    Transportation," Research Report No. 38, Dept. of Civil Engineering,
    Massachusetts Institute of Technology, Cambridge, Mass., January
    1963.

53. Martin, J. R., Ledbetter, W. B., Ahmad, H., and Britton, S. C.,
    "Nonbloated Burned Clay Aggregate Concrete," Jour. of Materials,
    JMLSA, Vol. 7, No. 4, December 1972, pp. 555-563.

54. Mathematical Association of America, "Introductory Statistics
    without Calculus," Report of the Panel on Statistics, Committee on the
    Undergraduate Program in Mathematics, June 1972, p. 15.

55. Meadows, D. H., Meadows, D. L., Randers, J., and Behrens, W. W.,
    The Limits of Growth, Universe Books, New York, 1972.


56. [...] During Peak Periods," Transportation Science, ORSA, Vol. 4, 1970,
    pp. 409-411.

57. MIL-STD-414, Sampling Procedures and Tables for Inspection by
    Variables for Percent Defective, Dept. of Defense, June 1957.

58. MIL-STD-105D, Sampling Procedures and Tables for Inspection by
    Attributes, Dept. of Defense, April 1963.

59. Mitchell, G. R., and Woodgate, R. W., "A Survey of Floor Loadings
    in Buildings," CIRIA Report 25, London, England, August

60. Mood, A. M., and Graybill, F. A., Introduction to the Theory of
    Statistics, McGraw-Hill Book Co., New York, 1963.

61. Morse, W. L., "Stream Temperature Prediction Under Reduced
    Flow," Jour. of the Hydraulics Division, ASCE, Vol. 98, HY6, June

62. Moulton, L. K., and Schaub, J. H., "Estimations of Climatic
    Parameters for Frost Depth Predictions," Transportation Engineering
    Jour., ASCE, Vol. 95, TE4, November 1969.

63. Murdock, J. W., and Kesler, C. E., "Effects of Range of Stress on
    Fatigue Strength of Plain Concrete Beams," Jour. of American
    Concrete Institute, Vol. 30, No. 2, August 1958.

64. Nishida, Y., "A Brief Note on Compression Index of Soil," Jour. of
    the Soil Mechanics and Foundations Div., ASCE, Vol. SM3, July 1956.

65. Packman, P. F., Pearson, H. S., Owens, J. S., and Marchese, G. B.,
    "The Applicability of a Fracture Mechanics-Nondestructive Testing
    Design Criteria," Technical Report, AFML-TR-68-32, Air Force
    Materials Laboratory, Wright-Patterson Air Force Base, Ohio,
    May 1968.

66. Parratt, L. G., Probability and Experimental Errors in Science, J.
    Wiley and Sons, New York, 1961.

67. Payne, H. J., "Freeway Traffic Control and Surveillance Model,"
    Transportation Engineering Jour., ASCE, Vol. 99, No. TE4,
    November 1973, pp. 767-783.

68. Pearson, E. S., and Johnson, N. L., Tables of the Incomplete Beta-
    Function, 2nd Ed., Cambridge Univ. Press, Cambridge, England,
    1968.

69. Pearson, K., Tables of the Incomplete B-Function, Cambridge Univ.
    Press, Cambridge, England, 1934.

70. Peck, R. B., "Sampling Methods and Laboratory Tests for Chicago
    Subway Soils," Proceedings, Purdue Conference on Soil Mechanics
    and Its Applications, Lafayette, Indiana, 1940.

71. Pettitt, J. H. D., "Statistical Analysis of Density Tests," Jour. of the
    Highway Div., ASCE, Vol. No. HW2, November 1967.


72. Proctor, R. P. M., and Paxton, H. W., "Stress Corrosion of Aluminum
    Alloy in Organic Liquids," Jour. of Materials, Vol. 4, No. 3,
    September 1969, p. 747.

73. Pugsley, A. G., "Structural Safety," Jour. Royal Aeronautical Society,
    Vol. 58, 1955.

74. Richardus, P., Project Surveying, J. Wiley and Sons, New York, 1966.

75. Shull, R. D., and Gloyna, E. F., "Transport of Dissolved Water in
    Rivers," Jour. of the Sanitary Engineering Division, ASCE, Vol. 95,
    SA6, December 1969, p. 1001.

76. Smeed, R. J., "Traffic Studies and Urban Congestion," Jour.
    Transportation Economics and Policy, Vol. 2, No. 1, 1968, pp. 33-70.

77. Tang, W. H., "A Bayesian Evaluation of Information for Foundation
    Engineering Design," Proceedings, First Intl. Conf. on Application of
    Statistics and Probability to Soil and Structural Engineering, Hong
    Kong Univ. Press, September 1971, pp. 173-185.

78. Tang, W. H., "Probabilistic Updating of Flaw Information," Jour. of
    Testing and Evaluation, JTEVA, Vol. 1, No. 6, November 1973,
    pp. 459-467.

79. Thayer, R. P., and Krutchkoff, R. G., "A Stochastic Model for
    Pollution and Dissolved Oxygen in Streams," Water Resources
    Research Center, Virginia Polytechnic Inst., Blacksburg, Virginia,
    1966.

80. Todd, D. K., and Meyer, C. F., "Hydrology and Geology of the
    Honolulu Aquifer," Jour. of the Hydraulics Div., ASCE, Vol. 97, No.
    HY2, February 1971, p. 251.

81. Viner, J. G., "Recent Developments in Roadside Crash Cushions,"
    Transportation Engineering Jour., ASCE, Vol. 98, No. TE1,
    February 1972, pp. 71-87.

82. Voorhees, A. M., "What Happens When Metropolitan Areas Meet,"
    Jour. of the Urban Planning Division, ASCE, Vol. 92, UP1, May
    1966, p. 16.

83. Ward, E. J., "Systems Approach to Choice in Transport Technology,"
    Transportation Engineering Jour., ASCE, Vol. 96, No. TE4,
    November 1970.

84. Wynn, F. H., "Shortcut Modal Split Formula," Highway Research
    Record, No. 283, Highway Research Board, National Research
    Council, 1969.


index 


A priori basis for probability, 22 
Accelerated strength of concrete, 310 
Acceleration, 168
gravitational acceleration, 215 
Acceptance criterion, 128, 360, 365, 376,
7 


Acceptance function, 365 

Acceptance gap, see Gap length 

Acceptance plan, 369 

Acceptance sampling, 16, 127, 360, 374, 
376, 378 


by attributes, 360, 376 
by variables, 360, 369, 376
Accidents, 23, 43, 85, 165, 355, 357 
in construction, 165 
highway accidents, 125 
impact speed in, 8 
prediction model, 355 
traffic accidents, 43, 114, 125, 329, 335, 


355 
Accuracy of estimate, 231 
accuracy of measurement, 356 
Activated sludge, 202 
Activity network, 153, 182 
Addition rule, 36, 38, 41, 47 
Aerial camera, 206 
Aggregates, 57, 59 
Aiken, 361, 376 
Airport, 152, 162, 209, 213, 214, 21
airport site, 30
Air traffic, 152 
Algebra, 33 
Allen, 281 
Allowable design stress, 188 


Allowable fraction defective, 373, 374, 377 


Aluminum, 227 

American Concrete Institute (ACI), 17, 284 
Analysis of uncertainty, 200, 203, 221 
Analysis of variance, 220 

Ang, 200

Angle, 258, 358 

Antenna system, 162 

Area, 94, 246, 247, 259, 260
Arrangement, 386, 387 

Arrival, 214, 216, 283 

Asphalt concrete, 365 

Assembly line, 109 

Associative rule, 32

Asymmetry, 94 

Atmospheric reaeration, 326 
Automobile ownership, 313 

Average, see Weighted average 

Average outgoing quality (AOQ),
367-369, 376
average outgoing quality limit (AOQL), 
367, 376 
Average quality criterion, 369, 372 
Axioms of probability, 36, 37, 82, 134 


Bachmann, 243 
Backup system, 124 
Bacteria, 70 
Barry, 243 
Base line, 201 
Basic relation of combinatorial analysis, 
386, 387
Bayes’ theorem, 22, 56, 59, 330, 331
Bayesian approach, 255, 329, 340, 354 
continuous case, 336 
discrete case, 330 
general formulation, 336 
in sampling theory, 344 
Bayesian estimate, 331-333, 338, 339, 345-
347, 350, 355, 357
Bayesian estimation, 329, 334, 337 
Bayesian probability, 22 
Bayesian statistical decision, 354 
Bayesian updating process, 341 
Beam, 359 
cantilever, 63, 142, 143, 152, 186, 205,
laminated, 316
simply-supported, 21, 24, 29, 40 
Bearing capacity, 14, 15, 23, 51, 73, 146, 


Bending moment, 143, 200, 208, 210, 356 
Benjamin, 121, 340, 341 
Bernoulli sequence, 106, 107, 109, 120, 126,


Beta distribution, 129-133, 225, 339, 351, 


standard beta distribution, 130 
Beta function, 129, 178 

incomplete beta function, 130 

incomplete beta function ratio, 131 
Biased estimate, 226 
Bid, 9, 15, 23, 30, 66, 146 


Binomial coefficient, 107, 387, 388


Binomial distribution, 84, 91, 107, 109, 
224, 252, 351, 352, 361 

Binomial expansion, 388 

Binomial theorem, 112

Bivariate normal, 138, 289 

Blow count, 59, 317, 318

Biological oxygen demand (BOD), 202, 257 


Boring, 109, 119, 120, 377 




Boulders, 109, 119, 120, 157, 163 
Breakdown, 154, 167 
Bridge, 23, 69, 147, 157, 160, 201, 207, 
208, 282, 348, 349 
toll bridge, 175, 323, 324, 348, 349 
Building, 165, 177, 181, 195 
apartment, 61 
frame, 150 
skyscraper, 164, 165 
Bulldozer, 19, 44
Butts, Schnepper and Evans, 302, 306 


Cable, 68, 113, 114, 203, 205
California Bearing Ratio (CBR), 168, 254,
364, 368 
Canal, 167 
Catch basin, 100, 211, 212 
Causal effect, 141 
causal relation, 143 
Cement, 371
Central limit theorem, 189, 190, 232, 245,
248, 251, 253, 261, 369
Central value, 88, 89 
Centroidal distance, 94 
Chain, 35, 48 
Chance failure, 124 
Chance occurrence, 329 
Channel flow, 217 
Characteristic function, 96 
Characteristic largest value, 270 
Chemical plant, 75 
Chi-square distribution, 133, 173, 177, 178,
250, 251, 274 
table of chi-square probability, 384 
Chi-square test for distribution, 261, 274- 
277, 279, 281, 283-285
Classical approach to estimation, 221, 233, 
254, 255, 333, 339, 346 
Classical statistics, 329, 354
Clearance, 206 
Clockwise moment, 206 
counterclockwise moment, 206 
Clopper and Pearson, 253 
Coefficient of variation (COV), 90-94, 105, 
132, 281, 328, 354, 359
Cofferdam, 160
Collectively exhaustive events, 30
Column, 181, 195, 200, 212, 334 
reinforced concrete column, 284
Combination of events, see Events 
Combinatorial analysis, 386
Commutative rule, 32
Compaction, 16, 168, 252, 254, 364, 368,
369, 374, 377
of subgrade, 7, 65
Complementary event, 26 
complementary set, see Sets
Completion time, 9, 38, 62, 64, 76, 78, 91,
132, 135, 147, 148, 153, 155, 160,
182, 214
Compressibility, 184 
Compression index, 309, 310, 358
Compressive strength, 159, 317, 318 
Compressive stress, 212 
Concrete cylinder, 128, 159 
Concrete mix, 211, 377 




concrete aggregate, 211 
ready-mix concrete, 377 
Concrete strength, 17, 128, 159, 212, 226, 
227, 249, 255, 276, 211, 310
Concrete structure, 359
concrete column, 284 
Conditional density function, 137 
of normal variates, 139
Conditional moment, 144
conditional mean, 143-145, 289 
conditional variance, 143-145, 288-290, 
294-299 


conditional standard deviation, 288, 290,


292, 295, 298, 300, 322, 324 
Conditional probability, 43-49, 120, 331, 


conditional probability density function, 
7 3 
conditional probability mass function, 


Confidence, 221, 234, 235, 238, 239, 242, 
243, 249, 254, 257, 260, 349, 365- 
367, 375, 377
interval, see Confidence interval 
lower confidence limit, 239, 257 
one-sided confidence limit, 238, 240, 
249, 251
upper confidence limit, 239, 242, 251,
252, 258
Confidence interval, 133, 289, 299, 307,


of mean, 231-236, 238, 240, 242, 255-
258 


in measurement theory, 244-248, 258-260 
of proportion, 252-254 
of variance, 249, 251 
Congestion of traffic, 213
Conjugate distribution, 351-353, 359 
Conservative, 188 
compounded conservatism, 12 
Consistency of estimator, 220, 223 
Construction planning and management, 15 
construction activity network, 182
construction company, 30 
construction labor, 135 
construction productivity, 135, 136 
construction project, 15, 64, 76, 128, 132, 
147, 151, 155, 169, 214, 377
Construction of probability paper, 270, 274
Consumer’s risk, 364, 370, 371, 374-378
Consumption of water, 204
Continuous random variable, 81, 84, 94, 
337, 338 
Contract price, 364 
Contractual time, 147 
Control and standards, see Standards 
Cornell, 200
Correlation, 9, 48, 140, 195, 210, 215
correlation analysis, 286, 315, 319 
correlation coefficient, 138, 140, 143, 
pa, 195, 289, 315-319, 322, 327, 
8 ; 2 


Corrosion, 168 

Cost, 11, 169, 186, 202
capital cost, 202 
of failure, 186 


of highway acquisition, 11 
initial cost, 186 
of material and labor, 169 
Covariance, 140, 193
Crack, 147, 215, 341 
detection, 147, 341 
growth, 215 
propagation, 341 
Crosswalk, 118 
Crushed rock, 74
Culvert, 1 g 
Cumulative distribution function (CDF), 81, 
84, 86, 278, 279 
cumulative probability, 234, 251 
properties of CDF, 82 
Cusens and Wettern, 276 
Cyclic load, 230 
cyclic stress, 215 


Dam, 148, 149 
control dam, 158 
earth dam, 173, 374, 377 
Darcy-Weisbach equation, 204 
Decision making, 220 
criterion, 186 
under uncertainty, 11 
Defective item, 127, 360, 361
Degree of belief,
Degree of freedom, 173, 177, 236, 237, 244,
250, 251, 274, 275, 277, 372 
Delay time, 151 
Demand for water, 73, 204 
de Morgan’s rule, 34, 35, 41
Density function, see Probability density 
function 
Density of fill, 365, 369-371, 374, 377 
Dependent variable, 170, 297
Derivation of the Poisson distribution, 390
Derivative, 82 
nth derivative, 96
partial derivative, 137
Derived probability distribution, 170, 191 
derived density function, 171
Design, 11, see also Engineering planning
and design
criterion, 117
of experiments, 220 
flood, 107, 157, 159 
life, 158 
load, 195
under uncertainty, 11 
wave height, 111 
wind velocity, 111, 157, 163 
Detectability, 342, 343 
Detection of material defects, 341
Determination of probability distribution,
261, 281
Diameter, 215 
Diesel engine, 124, 377 
Differential equation, 391
Differential settlement, 51, 101, 295, 296 
Diffuse prior distribution, 338, 341, 345 
Dike, 159, 205
Dimension, 
Discrete random variable, 81, 84, 94, 330 
Diseased trees, 79 




Disjoint sets, 30 
Dispersion, 7, 9, 90 
measure of, 7, 88-90 
Dissolved oxygen (DO), 7, 17, 52, 165, 235, 
238, 251, 255, 256, 300, 302, 306, 
EE 
Dissolved solid, 5
Distance, 214, 243-247, 258, 260, 315, 358
distance downstream, 313, 315
Distribution function, see Cumulative 
distribution function 
Distributive rule, 33 
Drainage system, 71 
drainage area, 11 
drainage water, 211
Drill hole, 119 
Drought, 30 
Duality relation, 35 


Earthquake, 30, 69, 73, 77, 114, 148, 161,
162, 168, 171, 330, 341, 377, 389
intensity, 73, 121, 
occurrence, 69, 121, 161, 164, 340, 341 
Efficiency, 135, 378 
of estimator, 220, 223 
Effluent, 165 
Eisenhart, 361, 376 
Elasticity, theory of, 184 
Elderton, 133
Electrical power, 377 
Electronic distance measurement, 233 
electronic ranging instrument, 245 
Elevation, 29020 E 
Elongation, 
Embankment, 16, 168, 252, 254, 364, 368, 
369, 374, 375 
Emergency control system, 124 
Emergency power, 124, 377
Empirical relation, 307, 310 
Energy consumption, 288, 322 
Energy line, 
Engineering planning and design, 1, 12, 17,
30, 354 
Engineering system, 1, 360
Equipment, 19,
N equipment, 106, 154, 161, 
167, 359
Erlang distribution, see Gamma distribution 
Error, 11, 15, 221, 324 
of estimation, 221, 255, 329, 332, 347- 
349
of measurement, 15, 194, 325, 349, 357,
358
prediction or model error, 11, 324 
propagation of error, 15, 199, 245, 246 
standard error, 233, 244-247, 258, 260 
systematic error, 243 
Esopus Creek, 255
Estimation of correlation coefficient, 315 
Estimation of parameter, 254, 255, 281, 
329, 330, 337, 349, 354 
interval estimation, 133, 254
maximum likelihood method, 222, 228- 
231, 252, 254 
method of moments, 222, 223, 254, 255
point estimation, 220, 254


Estimation of proportion, 252 
Estimator, 222, 232, 236, 253, 254, 329,


Bayesian estimator, 331
expected value of, 223 
minimum variance estimator, 229 
unbiased estimator, 223, 231 
variance of, 223 
Evaporation, 211 
Event, 19, 80 
certain event, 26, 37 
collectively exhaustive events, 30, 53, 56
combination of, 27 
complementary event, 26, 37, 47 
impossible event, 26, 30 
mutually exclusive events, 20, 30, 37, 38, 
40, 41, 51, 53, 56, 80 
union and intersection of, 27, 28 
Excavation, 182 
Exceedance, 109, 112, 150, 185, 210, 340
Expected cost, 13, 14, 186 
Expected loss, 206 
Expected value, see Mean value 
of estimator, 223 
of sample mean, 231 
Experience, 330, 332, 351, 370 
Experiment, 128, 330 
experimental outcome, 331, 337, 344
Exponential distribution, 92, 120, 122-125, 
166, 167, 174, 204, 213, 214, 224, 
229, 269, 270, 352, 359 
shifted exponential distribution, 123 
Exponential probability paper, 269, 272, 


Extremal probability paper, 270, 274 
Extreme value distribution, 133, 145, 261,
270, 27. 


F-distribution, 133 
Factor of safety, 22, 205 
Factorial of large number, 389 
Failure, 256 
failure surface, 206 
of structure, 55 
Falsework, 377 
Family income, 313 
Fare increase, 321, 322 
Fatality, 355
Fatigue, 341
fatigue crack, 114
fatigue life, 4, 10, 14, 227, 230, 301, 307,
308, 311, 315
Feller, 389 
Fetch, 173
Fill material, 364, 368, 369, 374, 377
Finite population, 127 
Finite sample space, 23 
Fire, 75, 357, 358 
First moment, 95 
First occurrence time, see Recurrence time 
First-order approximation, 191, 197, 199, 
200, 216, 218, 245
Fissure, 77 
Fixed support, 143 
fixed-end moment, 63 
Flaws, 215, 341-343


flaw detection, 70, 341, 342 
in weld, 23, 70, 341, 351, 376 
Flexibility, 218 
Flood, 30, 65, 69, 71, 77, 113, 154, 157- 
159, 160, 161, 211, 212, 238, 257
annual flood of a river, 109, 158, 160 
control dike, 159 
control system, 106, 109, 211 
Flow velocity, 204 
Footing, 182
column footing, 58, 146, 218 
settlement of, 50, 184, 218, 327 
Force, 147, 203, 322
Foundation, 14, 51, 146, 158, 212, 377
foundation wall, 182 
Foundation engineering, 73 
Fourth central moment, 248 
Fraction defective, 360, 361, 366-368, 373-
B 
fraction defective criterion, 372-375 
Fracture, 113, 341
fracture toughness, 10, 264-267, 279 
Freeboard, 173 
Freeway, 149 
Frequency diagram, 2-4, 6, 8, 132, 261, 276
cumulative frequency, 277-279
observed frequency, 274, 276, 277, 285
theoretical frequency, 274, 276, 277 
Freund, 226, 236, 249 
Friction, 204 
friction factor, 204 
Frost depth, 299 
Frost duration, 324, 325 
Function of random variable, 170 
moments of, 191
multiple random variables, 174, 190, 198 
product of random variables, 183 
quotient of random variables, 183 
sum of independent normal variates, 178 
sum of Poisson random variables, 175 


Galligan and Snodgrass, 316 
Gamma distribution, 124-126, 216, 223, 
224, 351-354 
gamma density function, 125, 126, 341 
Gamma function, 126, 129 
incomplete gamma function, 126 
incomplete gamma function ratio, 127 
Gamma-Normal distribution, 352 
Gap length, 8, 166, 283, 284 
Gaussian, 152, 154, 160, 164, 166, 180,
181, 187, 189, 205, 207-210, 225,
232, 235-237, 240, 242, 245, 249- 
257, 261, 263, 300, 345-351, 358, 3 
369-378; see also Normal distribution 
General function of random variable, 196 
Geodesy, 16 
geodetic engineering, 248 
geodetic measurement, 15 
geodetic station, 245 
Geometric distribution, 110-112, 125, 224 
return period, 110 
Geotechnical design, 14 
Girder, 69
Goldman and Ushijima, 314
Goodness-of-fit test for distribution, 261, 


274, 276, 277, 279, 281
Chi-square test, 261, 274-277, 279, 281, 
4, 285 
Kolmogorov-Smirnov test, 261, 274, 277-
279, 281
Grandage, 371 
Gravel pit, 211 


Gross national product (GNP), 288, 322


Gumbel distribution, 270
Gumbel probability paper, 270, 273, 274 


Hald, 197, 248, 250, 289, 361, 376
Hardy, Littlewood, and Polya, 141


Hazen, 262


Heathington and Tutt, 310 
Hierarchy of operations, 33 


Higher-order moments, 96, 198, 199


Highway, 43, 84, 149, 150, 162, 238, 299,
360
intersection, 20, 46, 117 
interstate highway, 54 
network, 61, 66 
project, 108, 254 
repair, 
Histogram, 2-4, 6-9, 227, 243, 257, 214, 
215, 284


Hoel, 228, 229, 274 

Hognestad- Ag 3 

Housi 

Hurricane, 13, 14, 77, 163, 206, 252, 314, 
315

Hydraulic analysis, 154 

Hydraulic head, 155, 204 

Hydraulic aa au 
logic design, : 

Hypergeometric distribution, 127, 360


Identically distributed random variables,
232, 244, er od : 

Impact pressure, . 

Impact speed in automobile accident, 8

Imperfection in modeling and estimation, 
10 


Independent increment, 390 
Independent observations, 228
Independent random variables, 231, 24 i 
365 
Inertia force, 168
Inference, see Statistical inference
Inferential method, 254 
Infinite sample space, 23 
Inflow, 73, 209, 211 
Influent, cnet a I 
t ility, F 
Inspection, 16, 109, 167, 254, 341-344, 360,
365, 366, 368, 370, 376
inspection plan, 3e 364, SUE 
Institute of Traffic Engineers,
Interarrival time, 8, 166, 167, 229, 283, 284
International Association of Chief of Police, 
18 
Intersection of events, 27, 28, 31 
Interval estimation, 221, 231, 243, 248, 254 
Intuition, 330 
Inverse function, 171, 172 


Irrigation, 211 


Joint distribution function, 134 
Joint probability, 318 : 34 
Joint probability density function, 134, 
137, 

of bivariate normals, 138 
Joint probability distribution, 134, 137 
Joint probability mass function, 134-136 
Jointly normal, 138, 288, 290, 318 

Jordan, Eggert, and Kneissl

Enea 525.334, 337, 354, 359 


Kimball, 263

Kinetic energy, : 
Kolmogorov-Smirnov (K-S) goodness-of-fit
test, 274, 277-281, 284, 285

Kothandaraman, 285, 326 


Lambe and Whitman, 295 
Land area, 246, 247, 259 
Lead concentration, 70 
Leakage in pipeline, 27 
Least error, 287 
Least-squares method, 16528] 9 
least-square criterion, is 
least-squares estimate, 288, 290, 294, 321,
323, 325
least-squares regression, 288, 298, 307,
321 


Left-turn, 30, 252 
left-turn bay, 117, 118 
left-turn lane, 20, 39 
left-turn pocket, 61
Tens, ie EN 
Life, 12. i 
operational life, 108, 124 
wen life, 14, 22. 158 
also Fatigue life
Likelihood, 109, 112, 214, 228, 331, 337,
339, 356-358
Likelihood function, 228, 229, 337-341, 
344-347 
logarithm of, 228, 230 
Limit condition, IT: 188 
Limiting process, 
Line of sight, 201
Linear function, 245, 290, 297, 306
mean and variates ads 4 
of normal variates, 
Linear graph in probability paper, 262-265,
267, 269, 270, 281
Linear model, 289
Linear prediction, 
Linear regression, 286-290, 294, 300, 303, 
306, 307,
Linear relationship, 140, 287, 300, 315-319, 
321, 323 
Linear trend, 290, 292, 307, 313 
Linsley and Franzini, 240 
Lipson and Sheth, 376
Liquefaction of sind 7350 A 
i utes River, R 
Load, 13, 21, 24, 29, 42, 58, 195, 200, 203,
205, 208, 210, 212, 230, 359 
concentrated load, 152 




dead and live load, 181 
design load, 195
load test, 322, 332, 333, 338
proof load, 59, 355, 357
uniformly distributed load, 152, 186
wind load, 239, 332, 339
Loading time, 68
Logarithmic normal, see Lognormal 
Logarithmic paper, 270 
Logarithmic transformation, 103, 315
Lognormal distribution, 184, 186, 190, 203, 204, 212,
216, 225-228, 230, 256-258, 265, 
267, 276, 277, 285, 350, 352
product of lognormal variates, 184-189
relation to normal distribution, 104 
relation of parameters to mean and 
variance, 104, 106
Lognormal probability paper, 265-269, 281
Loss function, 331
Loucks and Lynn, 17, 365 
Lower limit, 238 
Lumber, 4, 377; see also Wood 


Machine, 108 
design of, 13 
Main descriptors, see Random variable 
Maintenance, 154 
Malfunction, 107, 108
of machine, 63
Malhotra and Zoldners, 310 
Manning equation, 217 
Mapping, 80 
Marginal probability density function, 138 
of normal variates, 139 
Marginal probability mass function, 135 
Markov chain (or Markov process), 120 
Martin, Memmott, and Bone, 313, 327 
Material defect, 341 
Material of construction, 369 
Material supply, 27 
Mathematical expectation, 88, 191, 197 
Mathematical Society of America, 47 
Mathematics of probability, 36 
Maximum likelihood estimator, see Method 
of maximum likelihood 
Meadows, Meadows, Randers, and Behrens, 
288, 322
Mean life, 93, 124 
mean time-to-failure, 124, 283 
Mean occurrence rate, 115, 275, 391
Mean recurrence time, see Return period 
Mean sea level, 111, 160 
Mean-square value, 89
Mean-value, 7, 88, 90, 91, 94, 97, 104, 131,
139, 226, 232, 233, 245, 249, 254-
257, 264, 284-286, 290, 297, 319,
331, 337, 345-347, 350, 353, 354,
359, 369 
of general function, 196 
of linear function, 191 
population-mean, 226 
sample mean, 226, 228
Mean-value function, 286, 287, 297, 298,
303. See also Regression equation;
Regression line 


02, 153-156, 158, 


Measurement, 193, 201, 214, 243,246, 259, 
260, 325, 349, 356-358, 365, 366, 
369, 375, 317 

theory of, 119, 243, 244 

Mechanics, 142

Median, 89, 91, 93, 105, 150, 151, 265,
266, 269, 281 

Method of maximum likelihood, 222, 228, 


maximum likelihood estimator (MLE), 

222, 228-231, 252, 254, 255, 345 

Method of moments, 222, 223, 254, 255 

Military Standards (MIL-STD), 374, 376 

Miller, 308 

Mitchell and Woodgate, 195 

Mixed random variable, 82 

Modal value, see Mode 

Mode, 88, 91-93, 132, 150, 151, 282, 331, 
345, 356 

Modulus of elasticity, 4, 9, 316

Modulus of rupture, 316

Moisture content, 369, 371 

Moment capacity, 152 

Moment-generating function, 96 

Moment of inertia, 94 


Moments of random variable, 95, 96, 223,
226, 254
higher moments, 223, 226, 248
Moments of functions of random variables,
linear functions, 191-196 
product of independent variates, 196 
general function, 196-202 
Monocacy River, 240, 268, 269, 293, 318
Morse, 280
Most probable value, see Mode 
Moulton and Schaub, 300, 324 
Multiple correlation, 319 
Multiple linear regression, 286, 297, 299,
307, 313, 319, 325
Multiple random variables, 133-145
Multiple stage sampling, 366, 375, 376 
Multiplication rule, 47 
Munse, 308 
Murdock and Kesler, 311 
Multinomial coefficient, 388
Mutually exclusive events, see Event 


Natural hazard, 133, 164 
Navigation lock, 167 
Negative binomial distribution, 113, 125 
Negative exponential, see Exponential 
distribution
Network of construction activities, 182 
trucking network, 180 
Nishida, 309
Noise intensity, 216
Noise pollution, 216
Nondestructive test (NDT), 147, 342 
Nonlinear regression, 286, 300, 303, 315 
Nonlinear relationship, 141, 302, 319
nonlinear function, 245, 325, 326
Nonlinear trend, 300
Nonsymmetry, 88
Normal distribution, 97, 152, 179, 189, 211,
222-225, 233, 237, 244, 249, 251,


256, 262, 264, 276, 277, 281, 284, 
289, 293, 299, 321, 345-352, 373 
bivariate normal distribution, 138, 289 
standard normal distribution, 98, 237, 
263 


table of normal probability, 235, 380 
see also Gaussian
Normal population, 234, 249, 251, 345,
349 


Normal probability paper, 262-265, 279, 

281, 284,
Normal variate, 205, 206, 209, 212, 218,

Pme 249, 258, demos 323 " 

Normalization of probability measure, 37, 
43, 337, ed, 344, 345 

nth moment, 9 

Nuclear power plant, 109, 124 

reactor structure, 168 
Number of distinct sequences, 386 


Observations, 20, 22, 242, 252, 259, 307, 
321


observational data, 219, 221, 222, 233
ones 329, 330, 331, 333, 334, 337, 
338, 347, Pb 354, 358 
Ocean wave, 198, 2i 
Offshore structure, 13, 111, 206 
Ohio River, 255
Open channel, 
Operating characteristic (OC) curve, 316,
362-364, 376
Operational rules, see Sets
Optimal, 1
Optimal inspection plan, 361, 364
Origin-destination (O-D) trip length, 8
Outflow, 73, 211
Overdesign, 13 
Overload, Dis 
Overturning ] i 
Oxygenation rate, 285, 326, 327, 


Parameter of distribution, 88, 92, 97, 103- 
106, 129, 212, 219, 220-224, 226-

231, 252, 254-256, 309, 330, 331, 
337, 338, 344-347, 350-353, 354, 
356, 357, 359 

Parking, 8, 71, 300, 302-306 

Parratt, 243 

Partial derivative, 137 


Pavement, 49, 56, 59, 158, 299, 360, 378
airport, 12, 370
Payne, 314
Pearson, 131
Pearson and Johnson, 131, 133
Pearson system of distribution, 133
Peck, 281
Penalty function, 
Percentile value, 101, 106, 150
Permutation, 
Photogrammetry, 16, 248, 325, 349, 350 
photogrammetric measurements, 15 
Physical process, 97, 133 
Pier, 69, 160, 168, 201, 349, 350 
Pile, 155, 157, 334 
capacity, 58, 59, 256, 338 
foundation, 58, 332 




Pipe, 168, 204, 209


Pipe How, 4935 204 

Pipeline, 27, d 

Plotting position, 262, 263, 265, 267, 269- 
271 


Point estimation, 221, 222, 228, 243 
point estimate, 226, 231, 244, 254-256,
i estimator, 228, 244, 245, 337 
int estimator, A j 3 
Poisson distribution, 115, 116, 175, 224,
275, 351, 352, 390 
derivation of the, ey ios 
sum of Poisson variates,
Poisson process, 114, 116-121, 123, 125, 
161-165, 167, 168, 175,176, 206, 
207, 214, 283, 335, 340, 351, 357, 
390 
assumptions, 114-115, 390 
relation to Bernoulli sequence, 115
Pole, 208 342 
Pollutant, B 
Pollution, 52, 57, 70, 72, 75, 165, 166,
216, 376 
Pollution control, D. 360 
Pool temperature, 
Population, 204, 300, 302-304, 315, 320, 
323
density, 313
mean, 133, 226, 231, 233, 234, 238, 240,
243, 248, 2. H d 
in sampling, 127, 220-223, 232, 244, 248,
249, 254, 262, 265, 281, 344, 345,
387
standard deviation, 239
variance, 133, 226, 236, 240, 248, 249
Possibility space, 19, 21
Possible outcomes, 20-22, 29
Posterior probability
posterior distribution, 337, 338, 341, 345-
352, 354, 355, 357, 358
posterior probability density function,
340, 344, 345, 357
posterior probability mass function, 332, 
334 


Posterior statistics, 353, 354 
variance, 347
Power generating plant, 48, 63 
nuclear, 109 15 
Power supply; 
Precipitation, 67, 77, 51, 211, 240-242, 
267-269, 292, 293, 318
Pressure, 6, 198, 204 
Prestress concrete, 367 
prestressing force, 367 
Prime mover, 124 
Prior assumptions, 330, 
ior estimate, 
Prior information, 345-347, 349-351, 354,
58, 3. : 


Prior probability, 331, 337 
prior distribution, 336, 337, 339, 343, 
349, 350-352, 359
prior probability density function, 337,
340, 344, 357
prior probability mass function, 332, 




Prior statistics, 354 
prior mean, 347 
Probabilistic characteristics, 87, 88 
Probability, 19, 36, 233, 252
axioms, 36, 37 
basic concepts of, 19, 360 
calculation of, 22 
conditional probability, 43 
expected probability, 52 
mathematics of, 36 
of union and intersection of events, 38, 41 
role of probability in engineering, 1 
theory of, 37 
Probability density function (PDF), 3, 82, 
85, 87, 92, 97, 146-151, 224, 228,
236, 261, 269, 282
Probability distribution, 81, 96, 219, 236,
254, 261 
common distributions, 223, 224 
derived distributions, 170, 191 
empirical determination of distribution
models, 261, 281
useful distributions, 97-133 
validity of distribution model, 274, 281 
Probability law, see Probability distribution 
Probability mass function (PMF), 82, 84, 
148-151, 224, 390
Probability measure, 22, 36, 252 
Probability model, 220, 222, 261 
Probability of damage, 112, 164 
Probability of failure, 13, 15, 42, 48, 51, 68, 
69, 78, 146, 152, 153, 164, 181, 205, 
207, 210, 212, 333, 334, 338, 359 
Probability of survival, 14, 153, 355, 357
Probability paper, 261, 262, 269, 270, 281, 


commercial, 263 
exponential, 269, 270, 272, 283 
general, 269 
Gumbel, 270, 273
lognormal, 262, 265 
normal, 262-265, 279, 284 
Rayleigh, 282 
triangular, 282 
Probability problems, 19, 22 
Probability tables, 97, 99, 235, 237, 251, 
361, 376, 379
standard normal, 99, 103, 380 
Producer’s risk, 364, 370, 371, 374-378 
Product quality, 109 
Product of random variables, 183, 190 
of independent variates, 196
of lognormal variates, 184-189
Productivity, 135 
Project, 91 
duration, 15, 106 
Projectile, 215 
Proof test, 59, 355, 357
Propagation of error, 15, 245 
Proportion, estimation of, 252
confidence interval, 252-254


Quality, 16 
assurance, 360, 369, 376, 377 
of concrete material, 17 
control, 127, 159, 364, 366, 368, 369 


Quarry, 68, 74 

Queue, 74 

Quotient of random variables, 183 
quotient of lognormal variates, 184 


Radius, 206, 259, 260 

Rainfall, 100, 106 
intensity, 2, 5, 13, 133, 255 

Rainstorm, 116, 154, 211, 275 

Rayleigh distribution, 225, 255, 282, 356 
Rayleigh probability paper, 282 

Random error, 15, 193, 214, 243 

Random occurrence, 390 

Random phenomena, analytical models of. 


Random sampling, 221, 222, 228, 231, 232,
24: 


random sample, 223, 231, 252, 344 
Random variables, 7, 80, 222, 233, 252, 
5 
continuous, 81, 84, 94, 337, 338 
discrete, 81, 84, 94, 330 
functions of, 170 
main descriptor of, 87, 145, 223, 254
mixed, 82, 83 
multiple, 133 
Range, 215 
range of random variable, 80 
Rapid transit, 148, 162, 321 
Rare event, 112 
Reactor containment structure, 367 
Reaction, 21, 24, 29 
Reaeration process, 326 
Real line, 80 
real space, 133 
Real world, 220, 222, 254 
Receiver, 361 
Reconstituted sample space, 43, 45; see 
also Conditional probability
Recurrence time, 110, 121, 167 
Reduction factor, 195, 359 
Reduction of variance, 289 
References, 392-397 
Regression, 9 
multiple regression, 297, 313 
nonlinear regression, 300 
Regression analysis, 286, 288, 290, 293, 
294, 297, 300, 303, 306, 307, 309, 
313, 315, 319, 322 


applications in engineering, 307


with constant variance, 286 
to determine empirical relation, 307
with nonconstant variance, 294 
to validate theoretical equation, 309 
Regression coefficients, 288, 294, 298 
confidence intervals of, 289, 299 
Regression equation, 290, 293, 294, 296, 
298, 303, 307, 310, 316, 317, 320, 
321, 322 
Regression line, 11, 288, 289, 292-294, 
300, 309, 320, 322-324
Regression of normal variates, 289 
Reinforced concrete, 284, 359 
Rejection, 371 
Relative frequency, 7, 20, 22, 37, 39, 136,
P : 


Relative likelihood, 330 
Reliability, 14, 50, 78, 153, 160, 185, 366,
377
theory of, 124 
Remote sensing, 16, 79 
Repair, 67, 147, 154, 158, 167, 341, 342, 
344 
Repeated loads, 14 
Repeated trials, see Bernoulli sequence 
Research and development, 78 
Reservoir, 67, 73, 74, 76, 149, 207, 211
reservoir dam, 69 
Resistance, 150 
Retaining wall, 68
Return period, 157, 158, 166, 205, 257, 
26: 
of Bernoulli sequence, 110, 112 
of Poisson process, 121 
Richardus, 195 


Ridership, 321, 322


Right turn, 30 

Ring, 259, 260 

Risk, 13, 15, 160, 210, 341, 364, 365, 370, 
313, 378


consumer's and producer's risks, 364, 
370, 371, 374-378 
permissible risk, 212 
risk-benefit analysis, 1 
River, 201, 257 
flow, 313, 315 
Road, 147, 509, 329, 335, 365 
Road grader, 108
Rock quarry, 68, 74
crushed rock, 74
rock stratum, 155
Roughness coefficient, 217


Runoff, 13, 133, 154, 240, 241, 267-269,


292, 293, 318 


Runway, 152


Safety, 206, 332
of building, 181 
factor, 188 
level, 187, 188 
margin, 15 
measure, 165 
Sample, 156 
data, 220, 253, 354, 37 
measurements, 243 
Sample of the population, 222 
Sample size, 223, 226, 232, 234-236, 238- 
240, 242, 243, 251-254, 277, 278,
299, 348, 360, 369, 370, 372-374, 
376, 387 
Sample point, 23 
Sample space, 23, 80, 133 
continuous, 23, 80 
discrete, 23, 80, 386, 387 
finite, 23 
reconstituted, 43, 45 
Sample statistics, 329, 369 
sample mean, 7, 226, 228, 231-236, 238- 
240, 256, 257, 260, 277, 297, 315, 
345-350, 369-373, 375 
sample moment, 223, 249, 254
sample standard deviation, 7, 238-240,


256, 258, 315, 346, 350, 358, 372, 
373 


sample variance, 226, 228, 236, 240, 248,
256, 257, 260, 277, 318, 320 
Sampling by attributes, 360, 375, 376 
Sampling by variables, 360, 369, 375, 376 
Sampling plan, 127, 361, 367, 369, 371,
373-378 
acceptance sampling, 360 
multiple stage sampling, 366, 375 
sequential sampling, 366, 375 
Sampling theory, 222, 344, 345, 354; see 
also Random sampling
Sampling with replacement, 387 
Sampling without replacement, 387, 388 
Sand, 328, 377 
saturated sand, 73, 230
Scatter, 7, 9, 141, 294, 295, 315
scattergram, 9, 294 
School cross-walk, 118 
Schwarz's inequality, 141 
Second moment, 96 
joint second moment, 140 
Second-order approximation, 197, 199,
200, 216, 218
Seismic region, 69, 114
Sequence, 386 
ordered sequence, 386 
Service station, 162 
Sets, theory of, 22 
complementary set, 32 
equality of sets, 31 
operational rule, 31 
subsets, 23, 24, 30 
Settlement, 51, 101, 102, 146, 295, 296
of bridge supports, 60
differential settlement, 51, 101, 102,
295, 296
excessive settlement, 51 
of footings, 50, 184, 218, 327 
Sewer, 157, 211 
sewer network, 209 
sewer system, 154 
Shear force, 143, 
shear stress in soil, 7, 281 
Shell structure, 101 
Shifted exponential distribution, see 
Exponential distribution 
Shortage of material, 76 
shortage of concrete, 27 
shortage of gas, 162 
shortage of steel, 27
shortage of water, 164 
Shull and Gloyna, 313
Significance level, in Chi-square test, 274, 
275, 277, 283-285 
in Kolmogorov-Smirnov test, 278-280,
2 
Simulation, 349 
Simultaneous equations, 298
Skewness measure, 94 
skewness coefficient, 94, 132 
Slip surface, 77 
Slope, 206, 217 


Sludge, 202


Smeed, 309 




S-N relation, 307, 308, 311
Soft lenses in soil deposit, 377
Soil deposit, 119 
soil stratum, 157, 256, 358 
Solid waste, 210, 378 
Specification, 159 
Speed, 242, 325, 326 
speed-traffic density relationship, 314, 315
Squared deviation, 89 
Squared error, 287, 294, 297 
Standards, 16, 252, 255, 360
of acceptance, 16, 360 
pollution control standard, 109 
of stream quality, 17, 165, 365, 366 
Standard beta distribution, 130 
Standard deviation, 89-92, 94, 97, 232, 233,
245, 246, 256-259, 264, 284, 285, 297,
299, 320, 321, 345, 347, 348, 356, 358,


Standard error, 233, 244-247, 258-260, 349,
350, 358


Standard normal distribution, 98, 237, 263 
table of standard normal probability, 380 
Standard normal variate, 98, 171, 177, 233,
234, 238, 250, 263, 265
Standard value, 369, 374
standard mean-value, 370-372
Standard variate, 265, 269, 273, 282 
Statistical estimation, see Estimation 
Statistical independence, 46, 48, 84, 107, 110,
112, 120, 135, 137, 140, 143, 144, 175,
177-179, 181, 182, 184, 193-196, 199,
206, 207, 209-212, 215, 216, 218, 232,
244, 334, 355, 371, 390
Statistical inference, 219, 220, 222
Statistical method, 39" 
Statistics, 10, 16 
Steel, 239 
reinforcement, 257, 281, 285 
Stirling's formula, 385 
Stopping distance, 325, 326 
Storm, 55, 154, 162, 240 
sewers, 154, 157, 209, 211
Strain energy, 172 
Strain gage, 128 
Stream, 211, 326, 375 
flow, 313, 315, 350
quality, 235, 365, 366, 375 
probabilistic stream standard, 17, 365 
temperature, 280 
Strength of material, 152, 238 
compressive strength of clay, 281, 290
of concrete, 17, 128, 159, 212, 226, 227, 
233, 249, 255, 216, 310, 315, 377 
flexural, 327 
of nonbloated burned clay aggregate concrete, 


shear strength, of clay, 281, 290
of fillet welds, 3
yield strength of reinforcing bars, 3, 257, 258
Stress, 212
bending stress, 4
cyclic stress, 215
extreme fiber stress, 186
shear stress, 7, 281
stress increment, 215
stress range, 282, 307, 308
Stress-strain curve, 7, 323
Strike, 76, 161 
Structural damage, 75, 112 
structural failure, 55 
Structure reliability, 181 
Structures, 161, 205, 256, 341, 342, 355,
design of, $5. 359 
structural component, 70 
superstructure, 214
temporary structure, 157, 339 
Student's t-distribution, see t-distribution
Subdivision, 71, 75 
Subgrade, 156 
compaction, 65, 66 
density, 7 
Success run, 366 
Sufficiency of estimator, 220, 223
Sulphur dioxide, 376, 377
Sum of random variables, 189
sum of independent Poisson variates, 175
sum (and difference) of normal variates,
179, 180, 232 
sum of squares, 226, 250 
Supersonic transport, 52 
Supplier, 361 
Surveying, 15, 243, 245, 248, 349, 350, 
357, 358 


Suspended solid, 202 
Symmetry, 94, $9 
Systematic error, 15, 243 


t-distribution, 133, 236-239, 244, 289, 372
table of, 383 


Table of α-percentile values of the χ² distribution, 384
Table of critical values of D_n^α in the Kolmogorov-Smirnov test, 385
Table of p-percentile values of the t-distribution, 383
Table of standard normal probability, 380


Tables of probability, see Probability table 
Tang, 341, 345, 348 

Tank, 206, 295
Target time, 15, 78
Taylor series, 197, 198
Tellurometer, 245
Temperature, 299, 300, 302, 306
freezing temperature, 23
of stream, 280 

Tendon, 168, 367 

Terrestrial elevation, 325 

Test results, 233, 239, 330, 356, 369 
proof test, 355, 357 


Testing validity of distribution, 274
Tharp model, 355
Thayer and Krutchoff, 321
Theoretical models, 146
Third central moment, see Skewness 
measure 
Tide, 174 
wind tide, 173. 
Time and space problem, 109, 110,114 


Time-to-failure, 124, 283, 359
Tolerable fraction defective, 374, 376 
Tornado, i» EE Un 3123120 
Total probability theorem, 52, 135, 138, 331, 332, 348, 390
Trade-offs, 12, 14, 15
Traffic, 84, 150, 161, 167, 169, 207, 209, 
229, 242, 252, 309, 329, 334, 348 
accidents, 43, 114, 125, 329, 335, 355 
congestion, 152
count, 20 
engineering, 310, 329, 335 
highway traffic, 54, 150 


volume, 118, 149, 308-311, 323, 324 
Transformation, 233, 286
Transitional probability, 73
Transmission tower, 110
radio tower, 258, 259
Transportation, modes of, 28, 72, 73, 148,
209
Travel times, 8, 61, 148, 149, 180, 205, 
209, 213, 321, 358 
Trials, 106, 114, 120, 252
Triangular distribution, 133, 147, 148, 151,
158, 225, 281, 357
triangular probability paper, 282
Triangulation, 16 
triangulated elevation, 325 
Triaxial specimen, 230 
Trilateration, 16 
Trip distance, 312, 313, 315 
trip generation, 313 
trip time, 209; see also Travel times
Truss, 42, 1
Tunnel,
Turbidity, 319, 320
Typhoons, 6 


Ultimate load, 284, 359 
Ultimate strain, 281, 285
Ultrasonics, 49, 56, 342
Unbiased estimator, 220, 223, 231, 248,
288, 295, 298
Unbiased sample variance, 226, 248
Uncertainty, 1, 200, 203, 218, 221, 238,
329, 347, 348
inherent randomness, 3, 221, 332, 354
in real world information, 3
from modeling and estimation, 10, 221,
329, 332, 338, 354
Underdesign, 14 
Underground water, 207 
Uniform distribution, 86, 133, 214, 225,
262, 338, 341, 357
Uniform flow, 217
Unimodal, 89
Union of events, 27, 28, 31
University of Illinois, 284
Updating process, 341, 354
updating probability, 58
Upper limit, 238, 357


Variability, 3, 11, 89, 90, 315, 330, 332, 347
inherent variability, 221
observed in histograms, 3
Variance, 89-94, 97, 104, 131, 139, 145,
149-151, 226, 231, 233, 246, 248,
251-254, 257, 284, 290, 293, 294,
298, 299, 313, 319-320, 348, 353
approximate variance, 197
conditional variance, 143, 144, 288, 289,
316
confidence interval of, 248-252
of estimator, 223
of general function, 191, 196
of linear function, 191
population variance, 226, 228
sample variance, 226, 228, 248
of sum of random variables, 192, 193, 195
Variate, 97
Vehicle speed, 242, 314, 315, 326
relation with stopping distance, 326
Velocity, 204, 215, 2 
Venn diagram, 26, 39, 40, 44, 53
of intersection of events, 28, 29 
of union of events, 28, 29 
Void ratio, 309, 310 
Volume, 215 
Voorhees, 312


Waiting time, 74, 148 

Walker, 28: 

Waste, 64, 210, 212, 355, 378
waste water, 202 

Water supply, 36, 63, 67, 76, 164, 207
distribution system, 155
pipe system, 70, 168
water consumption, 67, 164, 320, 321
Water level, 62, 63, 75
Water quality, 109, 360
Water tower, 77
Watershed, 5, 13, 67, 71, 211
Wave, 111
height, 4, 160, 173, 203, 255
pressure, 204
velocity, 198, 204 

Weather, 214 

Weighted average, 88, 89, 226

Welding machine, 92 

Welds, 70, 114, 147, 266, 267, 343, 376
cracks in, 147, 341
flaws in, 23, 70, 342, 34, 351
shear strength of, 3

Wind, 6, 61, ano 162, 164, 174, 252, one 
load, 339; 7339 
profile, 314 
tide, 173
velocity, 156, 163, 203, 256, 258, 281,
314, 315, 339

Wire, 113
Wood, structural grade, 156, 377
modulus, of elasticity, 316
of rupture, 316
wood beam design, 186
Work duration, 135
Workmanship, 351, 376
Wynn, 302, 304 


Yield strength of steel, 3, 239, 257
Yield strength of wood, 186
Young's modulus, 323 


26.2.19
\[ P(x) = 1 - \tfrac{1}{2}\left(1 + d_1 x + d_2 x^2 + d_3 x^3 + d_4 x^4 + d_5 x^5 + d_6 x^6\right)^{-16} + \epsilon(x), \qquad |\epsilon(x)| < 1.5 \times 10^{-7} \]
d_1 = .0498673470    d_4 = .0000380036
d_2 = .0211410061    d_5 = .0000488906
d_3 = .0032776263    d_6 = .0000053830

26.2.20
\[ Z(x) = \left(a_0 + a_2 x^2 + a_4 x^4 + a_6 x^6\right)^{-1} + \epsilon(x), \qquad |\epsilon(x)| < 2.7 \times 10^{-3} \]
a_0 = 2.490895    a_4 = -.024393
a_2 = 1.466003    a_6 = .178257

26.2.21
\[ Z(x) = \left(b_0 + b_2 x^2 + b_4 x^4 + b_6 x^6 + b_8 x^8 + b_{10} x^{10}\right)^{-1} + \epsilon(x), \qquad |\epsilon(x)| < 2.3 \times 10^{-4} \]
b_0 = 2.5052367    b_6 = .1306469
b_2 = 1.2831204    b_8 = -.0202490
b_4 = .2264718     b_{10} = .0039132

7 Based on approximations in C. Hastings, Jr., Approximations for digital computers. Princeton Univ. Press, Princeton, N.J., 1955 (with permission).
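The three approximations 26.2.19 to 26.2.21 above can be checked numerically. The following is a minimal sketch, assuming that P(x) denotes the standard normal cumulative probability and Z(x) the standard normal density; the function names are hypothetical, and the use of symmetry for negative x in the cumulative probability is an assumption added here, not part of the listed formula.

```python
import math

# Coefficients copied from approximations 26.2.19, 26.2.20 and 26.2.21.
D = (0.0498673470, 0.0211410061, 0.0032776263,
     0.0000380036, 0.0000488906, 0.0000053830)    # d_1 .. d_6
A = (2.490895, 1.466003, -0.024393, 0.178257)       # a_0, a_2, a_4, a_6
B = (2.5052367, 1.2831204, 0.2264718,
     0.1306469, -0.0202490, 0.0039132)              # b_0, b_2, ..., b_10


def cdf_26_2_19(x):
    """Approximate P(x); negative x is handled by symmetry (an assumption)."""
    if x < 0.0:
        return 1.0 - cdf_26_2_19(-x)
    s = 1.0 + sum(d * x ** (k + 1) for k, d in enumerate(D))
    return 1.0 - 0.5 * s ** -16


def _inverse_even_poly(coeffs, x):
    """Return 1 / (c_0 + c_1 x^2 + c_2 x^4 + ...)."""
    return 1.0 / sum(c * x ** (2 * k) for k, c in enumerate(coeffs))


def pdf_26_2_20(x):
    """Approximate Z(x) with the four-term polynomial (coarser)."""
    return _inverse_even_poly(A, x)


def pdf_26_2_21(x):
    """Approximate Z(x) with the six-term polynomial (finer)."""
    return _inverse_even_poly(B, x)


def exact_cdf(x):
    """Reference value of P(x) from the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def exact_pdf(x):
    """Reference value of Z(x)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


if __name__ == "__main__":
    for x in (0.0, 0.5, 1.0, 2.0, 3.0):
        print(x,
              abs(cdf_26_2_19(x) - exact_cdf(x)),
              abs(pdf_26_2_20(x) - exact_pdf(x)),
              abs(pdf_26_2_21(x) - exact_pdf(x)))
```

Running the script prints, for each x, the absolute error of each approximation against the reference values, which can be compared with the stated error bounds.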




i "i = i 


TABLE 7. CRITICAL ABSOLUTE VALUES OF CORRELATION COEFFICIENT r*

5% points and 1% points (in boldface) for equal-tails test of hypothesis ρ = 0.

[Tabulated values not recoverable from the scan; columns are headed by the total number of variables: 2, 3, 4, 5.]



CONVERSION FACTORS

Customary units                      SI metric units                 Customary to SI

inches (in.)                         meters (m)                      0.0254
inches (in.)                         centimeters (cm)                2.54
inches (in.)                         millimeters (mm)                25.4

feet (ft)                            meters (m)                      0.305
yards (yd)                           meters (m)                      0.914
miles (miles)                        kilometers (km)                 1.609

degrees (°)                          radians (rad)                   0.0174

acres (acre)                         hectares (ha)                   0.405
acre-feet (acre-ft)                  cubic meters (m³)               1233

gallons (gal)                        cubic meters (m³)               3.79 × 10^-3
gallons (gal)                        liters (l)                      3.79

pounds (lb)                          kilograms (kg)                  0.4536
tons (ton, 2000 lb)                  kilograms (kg)                  907.2

pound force (lbf)                    newtons (N)                     4.448
pounds per sq in. (psi)              newtons per sq m (N/m²)         6895
pounds per sq ft (psf)               newtons per sq m (N/m²)         47.88

foot-pounds (ft-lb)                  joules (J)                      1.356
horsepowers (hp)                     watts (W)                       745.7
British thermal units (BTU)          joules (J)                      1055
British thermal units (BTU)          kilowatt-hours (kwh)            2.93 × 10^-4

DEFINITIONS 


newton: force that will give a 1-kg mass an acceleration of 1 m/sec²
joule: work done by a force of 1 N over a displacement of 1 m

1 newton per sq m (N/m²) = 1 pascal

1 kilogram force (kgf) = 9.807 N

1 gravity acceleration (g) = 9.807 m/sec²

1 are (a) = 100 m²

1 hectare (ha) = 10,000 m²

1 kip (kip) = 1000 lb
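The customary-to-SI factors tabulated above are simple multipliers, so a conversion is a single multiplication (or a division for the reverse direction). The following is a minimal sketch under that reading; the dictionary name, unit labels, and function names are hypothetical, and only a subset of the tabulated factors is included.

```python
# Multipliers taken from the conversion table: (customary unit, SI unit) -> factor.
# The short unit labels used as keys are illustrative, not from the table.
TO_SI = {
    ("in", "m"): 0.0254,
    ("in", "cm"): 2.54,
    ("in", "mm"): 25.4,
    ("ft", "m"): 0.305,
    ("yd", "m"): 0.914,
    ("mile", "km"): 1.609,
    ("gal", "l"): 3.79,
    ("lb", "kg"): 0.4536,
    ("lbf", "N"): 4.448,
    ("psi", "N/m^2"): 6895.0,
    ("psf", "N/m^2"): 47.88,
}


def to_si(value, customary, si):
    """Convert a value in a customary unit to the given SI unit."""
    try:
        factor = TO_SI[(customary, si)]
    except KeyError:
        raise ValueError(f"no factor tabulated for {customary} -> {si}") from None
    return value * factor


def from_si(value, si, customary):
    """Convert a value in an SI unit back to the given customary unit."""
    try:
        factor = TO_SI[(customary, si)]
    except KeyError:
        raise ValueError(f"no factor tabulated for {si} -> {customary}") from None
    return value / factor


if __name__ == "__main__":
    # A concrete strength of 3000 psi expressed in N/m^2 (pascals).
    print(to_si(3000.0, "psi", "N/m^2"))
    # One kip (1000 lb of force) expressed in newtons.
    print(to_si(1000.0, "lbf", "N"))
```

Because the tabulated factors are rounded to three or four significant figures, results from this sketch carry the same precision and should not be read as exact.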


